Novelty and Diversity Enhancement and Evaluation in Recommender Systems
Saúl Vargas, April 2012
Novelty and diversity as relevant dimensions of retrieval quality are receiving increasing attention in the Information Retrieval (IR) and Recommender Systems (RS) fields. Both problems have nonetheless been approached under different views and formulations in IR and RS respectively, giving rise to different models, methodologies, and metrics, with little convergence between the two fields. We find considerable room for research towards the formalization of diversification methods, evaluation methodologies, and metrics. Furthermore, we ask ourselves whether there should be some natural connection between the perspectives on diversity in IR and RS, given that recommendation is after all an information retrieval problem.
In the present work we propose an Information Retrieval approach to the evaluation and enhancement of novelty and diversity in Recommender Systems. We draw models and solutions from text retrieval and apply them to recommendation tasks in such a way that the recent advances achieved in the former can be leveraged for the latter. We also propose a new formalization and unification of the way novelty and diversity are evaluated in RS, considering rank and relevance as additional and meaningful aspects for the evaluation of recommendation lists. We propose a framework that includes and unifies the main state-of-the-art metrics for novelty and diversity in RS, generalizing and extending them with further properties and flexibility in configuration. Our contributions are tested with standard RS collections, in order to validate our proposals and provide further insights.
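As a rough illustration of what taking rank and relevance into account can mean for a novelty metric, the sketch below weights each item's novelty by a rank discount and by an estimated probability of relevance. It is not the thesis framework itself: the function name, the exponential discount, and the `item_novelty` and `relevance_prob` inputs are all assumptions made for this example.

```python
def rank_relevance_novelty(ranked_items, item_novelty, relevance_prob, discount_base=0.85):
    """Rank-discounted, relevance-weighted average novelty of a recommendation list.

    ranked_items:   items in recommended order (best first)
    item_novelty:   dict item -> novelty score (hypothetical, e.g. 1 - popularity)
    relevance_prob: dict item -> estimated probability that the item is relevant
    discount_base:  assumed exponential rank discount; 1.0 disables rank weighting
    """
    weighted = 0.0
    norm = 0.0
    for rank, item in enumerate(ranked_items):
        discount = discount_base ** rank
        # Items without a relevance estimate contribute no novelty.
        weighted += discount * relevance_prob.get(item, 0.0) * item_novelty[item]
        norm += discount
    return weighted / norm if norm else 0.0
```

With `discount_base=1.0` and every relevance probability set to 1, the function reduces to the plain mean novelty of the list, which is the rank-unaware, relevance-unaware case.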
Full text (pdf)
Temporal Models in Recommender Systems: An Exploratory Study on Different Evaluation Dimensions
Pedro G. Campos, June 2011
A Recommender System (RS) is a computer program able to identify specific objects for different user interests. Given that many RS have been operating for years, temporal information as a source to obtain better recommendations is acquiring more importance. The currently active research field of RS has tried to incorporate this information in the form of new recommendation algorithms. However, a common evaluation framework for testing improvements in this area is still missing; most proposals have been developed for and tested under specific (and different) datasets, circumstances and metrics, making it difficult to fairly compare them.
This work aims to help establish a better perspective on the impact of techniques that deal with temporal information in RS. An evaluation protocol scheme is developed, in order to allow the usage of the different techniques under a common experimental setting, considering two common recommendation tasks (rating prediction and top-N recommendation). We assess the recommendations’ results obtained with different recommendation algorithms on five different evaluation dimensions (statistical accuracy, decision support accuracy, novelty, diversity and coverage), including six different metrics (RMSE, Precision, AUC, Self-Information, Intra List Similarity and Interest Coverage).
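For readers unfamiliar with some of the metrics named above, the following minimal sketch shows common textbook forms of four of them. The exact definitions used in the thesis may differ; the function names, the averaged form of Intra List Similarity, and the `item_popularity` and `similarity` inputs are assumptions for illustration only.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top-n recommended items that are relevant."""
    top = ranked_items[:n]
    return sum(1 for item in top if item in relevant_items) / n

def mean_self_information(ranked_items, item_popularity, num_users):
    """Average -log2 of the share of users who rated each item (higher = more novel)."""
    return sum(
        -math.log2(item_popularity[item] / num_users) for item in ranked_items
    ) / len(ranked_items)

def intra_list_similarity(ranked_items, similarity):
    """Average pairwise similarity within a list (lower = more diverse)."""
    pairs = [
        (a, b)
        for idx, a in enumerate(ranked_items)
        for b in ranked_items[idx + 1:]
    ]
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)
```

RMSE applies to the rating prediction task, while the other three are computed on top-N recommendation lists.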
Results show that, contrary to what might be expected, not all time-aware algorithms outperform their time-unaware counterparts, in particular with respect to accuracy on rating prediction (statistical accuracy), even though the main motivation for developing such extensions is generally to increase accuracy. Moreover, the behavior of the algorithms on the assessed metrics varies notably, and no particular technique can be considered “the best” across the different evaluation dimensions. These findings stress 1) the importance of establishing a common and rigorous evaluation scheme when different algorithms are to be compared, and 2) that the most suitable recommendation algorithm depends on the particular task at hand and the evaluation dimension of interest.
Full text (pdf)
Performance prediction in recommender systems: application to the dynamic optimisation of aggregative methods
Alejandro Bellogín, July 2009
Performance prediction has gained increasing attention in the Information Retrieval (IR) field since the middle of the past decade and has now become an established research topic in the field. Predicting the performance of an IR system, subsystem, module, function, or input enables an array of dynamic optimisation strategies which select at runtime the option which is predicted to work best in a particular situation, or adjust on the fly its participation as part of a larger system or a hybrid approach. The present work restates the problem in the subarea of Recommender Systems (RS), where it has barely been addressed so far. We research meaningful definitions of performance in the context of RS, and the elements to which it can sensibly apply. We take as a driving direction the application of performance prediction to achieve improvements in specific combination problems in the RS field. We formalise the notion of performance prediction in specific terms within this frame, and we investigate the potential adaptation of performance predictors defined in other areas of IR (mainly query performance in ad hoc retrieval), as well as the definition of new ones based on theories and tools from Information Theory. The proposed methods are tested empirically with positive results, finding four predictors which outperform standard algorithms at all sparsity levels, two of them showing significant correlation with performance measures.
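The idea of adjusting a component's participation in a hybrid according to a performance predictor can be sketched as follows. This is only an illustration under stated assumptions, not the predictors or combination rules evaluated in the thesis: the clarity-style predictor (a Kullback-Leibler divergence between a user's rating distribution and the global one), the additive smoothing, the linear weighting, and all function names are hypothetical.

```python
import math
from collections import Counter

def rating_distribution(ratings, values, smoothing=1.0):
    """Smoothed probability of each rating value (additive smoothing is an assumption)."""
    counts = Counter(ratings)
    total = len(ratings) + smoothing * len(values)
    return {v: (counts[v] + smoothing) / total for v in values}

def user_clarity(user_ratings, all_ratings, values=(1, 2, 3, 4, 5)):
    """Hypothetical predictor: KL divergence (bits) of the user's ratings from the background."""
    p_user = rating_distribution(user_ratings, values)
    p_background = rating_distribution(all_ratings, values)
    return sum(p_user[v] * math.log2(p_user[v] / p_background[v]) for v in values)

def dynamic_hybrid_score(score_a, score_b, predictor_value, max_value):
    """Blend two recommenders' scores; recommender A's weight grows with the predictor."""
    if max_value <= 0:
        weight = 0.5  # no usable predictor range: fall back to equal weights
    else:
        weight = min(max(predictor_value / max_value, 0.0), 1.0)
    return weight * score_a + (1.0 - weight) * score_b
```

Here the predictor value is computed once per user and normalised by an assumed maximum, so that users for whom recommender A is predicted to perform well receive scores dominated by A, and the remaining users fall back towards recommender B.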
Full text (pdf)