Carlos Gomez-Uribe, Director of Product Innovation, Personalization technologies at Netflix

Challenges and Limitations in the Offline and Online Evaluation of Recommender Systems: A Netflix Case Study

The typical use case of recommendation systems is suggesting items such as videos, songs or articles to users. Evaluating a recommender system is critical to the process of improving it. In theory, the best judges of the quality and effectiveness of a recommender system are the users themselves; ideal metrics would, for example, describe the intensity and frequency of a user's interaction with the system over the long term. In practice, however, despite the wide adoption of consumer science based on online A/B testing for the evaluation and comparison of different recommender systems, user-derived measurements are often noisy, slow, non-repeatable, and sensitive to a myriad of potential confounders. Furthermore, conducting large-scale user experiments is often impossible for researchers in academia. A complementary offline approach can be used to quickly evaluate and optimize new recommender systems on historical user-generated data. Yet these offline measurements need not translate directly into the sought-after online results, such as increases in user engagement. This talk will describe the blend of offline and online experimentation we use at Netflix to improve our recommendation systems, and will discuss some key challenges and limitations of these approaches that are broadly relevant to the recommender systems field.

Carlos Gomez-Uribe is Director of Product Innovation at Netflix, where he leads multiple teams of scientists and engineers focused on improving the recommendation system that connects Netflix members with videos they will enjoy. Prior to Netflix, Carlos spent two years at Google working on the web search algorithm. Carlos received his PhD in Medical and Electrical Engineering from the Massachusetts Institute of Technology (MIT) and Harvard in 2008. He also holds a Master of Engineering degree in Electrical Engineering and Computer Science, and Bachelor of Science degrees in Mathematics and in Electrical Engineering and Computer Science, all from MIT. Carlos's research interests revolve around networked stochastic dynamical systems, such as recommendation systems and signaling networks in biology.

RUE 2012 - 1st International Workshop on Recommendation Utility Evaluation: Beyond RMSE
6th ACM Conference on Recommender Systems (RecSys 2012)
Dublin, Ireland, 9 or 13 September 2012