Speaker: Yisong Yue
Carnegie Mellon University
Title: Optimizing Recommender Systems as a Submodular Bandits Problem
User feedback has become an invaluable source of training data for optimizing recommender systems in a rapidly expanding range of domains, most notably content recommendation (e.g., news, movies, ads). When designing recommender systems that adapt to user feedback, two important challenges arise. First, the system should recommend optimally diversii- ed content that maximizes coverage of the information the user i- nds interesting (to maximize positive feedback). Second, the system should make appropriate exploratory recommendations in order to learn a reliable model from feedback. In this talk, I will describe the Linear Submodular Bandits Problem, which is a framework for jointly modeling the utility of a set of recommendations (so as to encourage diversity), as well as the exploration/exploitation trade-off that arises when learning from user feedback. In particular, the utility of a set of recommendations is modeled as a parameterized submodular function, which naturally encodes a notion of diminishing returns that encourages diversity. For this setting, I will also present an online learning algorithm that can efficiently converge to a near-optimal model.
As with any bandit learning problem, the inefficiency (or regret) of a recommendation algorithm is due primarily to the cost of exploration (i.e., making exploratory recommendations due to not knowing the user's preferences a priori). One way to reduce the cost of exploration is by leveraging prior knowledge. Intuitively, most users bear some similarity to "stereotypical users" that can be represented in a low-dimensional, or coarse, feature space. I will show how to construct a coarse-to-fine hierarchy of feature spaces from the preference profiles of existing users, and also how to conduct bandit learning using this feature hierarchy to drastically reduce the amount of exploration required.
I will present a live user study, where these approaches were applied to the setting of personalized news recommendation. Our results demonstrate improved performance against approaches that do not directly model diversification, do not employ exploration, or do not incorporate prior knowledge to reduce the amount of exploration required.
This is joint work with Carlos Guestrin and Sue Ann Hong.