Explore, Exploit, and Explain: Role of Explanation & Attribution in Multi-stakeholder Marketplaces
The multi-armed bandit is an important framework for balancing exploration with exploitation in recommendation. Exploitation recommends content (e.g., products, movies, music playlists) with the highest predicted user engagement and has traditionally been the focus of recommender systems. Exploration recommends content with uncertain predicted user engagement for the purpose of gathering more information. In parallel, explaining recommendations (“recsplanations”) is crucial if users are to understand their recommendations. Existing work has studied bandits and explanations independently; we provide the first method that combines both in a principled manner. In particular, our method is able to jointly (1) learn which explanations each user responds to; (2) learn the best content to recommend for each user; and (3) balance exploration with exploitation to deal with uncertainty. We conclude by discussing recent advances in multi-objective modeling and outlining key open issues around explanations and attribution in multi-objective recommendation.
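The explore/exploit trade-off over jointly chosen content and explanations can be illustrated with a minimal epsilon-greedy bandit whose arms are (content, explanation) pairs. This is an illustrative sketch only: the class, arm names, and reward handling below are assumptions for exposition, not the method presented in the talk.

```python
import random

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy bandit over (content, explanation) arms.

    With probability epsilon it explores a random arm; otherwise it
    exploits the arm with the highest estimated engagement so far.
    """

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)                      # (content, explanation) pairs
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}     # times each arm was shown
        self.values = {a: 0.0 for a in self.arms}   # running mean reward per arm

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental update of the arm's mean observed engagement.
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n

# Hypothetical usage: two playlists crossed with two explanation styles.
arms = [(c, e) for c in ("playlist_a", "playlist_b")
               for e in ("because_you_liked", "popular_near_you")]
bandit = EpsilonGreedyBandit(arms, epsilon=0.2)
chosen = bandit.select()
bandit.update(chosen, reward=1.0)  # e.g. the user engaged with the recommendation
```

In a per-user (contextual) version, as the abstract suggests, the value estimates would be conditioned on user features rather than shared globally.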
Rishabh Mehrotra is a Senior Research Scientist at Spotify Research in London. He obtained his PhD in the field of Machine Learning and Information Retrieval from University College London, where he was partially supported by a Google Research Award. His PhD research focused on the inference of search tasks from search and conversational interaction logs. His current research focuses on machine learning for marketplaces, bandit-based recommendations, counterfactual analysis, and experimentation. Some of his recent work has been published at conferences including KDD, WWW, SIGIR, NAACL, RecSys, and WSDM. He has co-taught a number of tutorials at leading conferences and summer schools.