Precision-oriented evaluation of recommender systems: an algorithmic comparison

@inproceedings{Bellogn2011PrecisionorientedEO,
  title={Precision-oriented evaluation of recommender systems: an algorithmic comparison},
  author={Alejandro Bellog{\'i}n and Pablo Castells and Iv{\'a}n Cantador},
  booktitle={Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys '11)},
  year={2011}
}
There is considerable methodological divergence in the way precision-oriented metrics are being applied in the Recommender Systems field, and as a consequence, the results reported in different studies are difficult to put in context and compare. We aim to identify the involved methodological design alternatives, and their effect on the resulting measurements, with a view to assessing their suitability, advantages, and potential shortcomings. We compare five experimental methodologies, broadly…
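
As a rough illustration of the kind of precision-oriented measurement the paper analyses, here is a minimal precision@k sketch; the function, item ids, and relevance set are illustrative, not the authors' exact protocol:

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k recommended items that appear in the
    user's relevant (test) set."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant_items) / k

# Toy example: a ranked recommendation list and the user's test items.
ranking = ["i1", "i7", "i3", "i9", "i2"]
relevant = {"i3", "i2", "i8"}
print(precision_at_k(ranking, relevant, 5))  # 2 of the top 5 are relevant -> 0.4
```

The methodological divergence the paper studies lies largely in how `ranked_items` is formed (which candidate items are ranked) and how `relevant_items` is defined, not in this final ratio.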

Citations

Evaluating the Relative Performance of Collaborative Filtering Recommender Systems
An evaluation framework based on a set of accuracy and beyond-accuracy metrics, including a novel metric that captures the uniqueness of a recommendation list, is presented; the study finds that the matrix factorisation approach leads to more accurate and diverse recommendations, while being less biased toward popularity.
Evaluating Decision-Aware Recommender Systems
This work analyses how a recommender system can measure confidence in its own recommendations, so that it can decide whether an item should be recommended or not, and explores evaluation metrics that combine more than one evaluation dimension.
Comparative recommender system evaluation: benchmarking recommendation frameworks
This work compares common recommendation algorithms as implemented in three popular recommendation frameworks and shows the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.
Statistical biases in Information Retrieval metrics for recommender systems
This paper lays out an experimental configuration framework upon which to identify and analyse specific statistical biases arising in the adaptation of Information Retrieval metrics to recommendation tasks, namely sparsity and popularity biases.
Assessing ranking metrics in top-N recommendation
A principled analysis of the robustness and the discriminative power of different ranking metrics for the offline evaluation of recommender systems is undertaken, drawing from previous studies in the information retrieval field.
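
For context, nDCG is one of the ranking metrics such studies typically examine; the sketch below uses binary relevance and is a minimal illustration, not the paper's exact formulation:

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain: the gain at rank i (1-based) is
    divided by log2(i + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """DCG normalised by the ideal DCG (gains sorted in decreasing order)."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Binary relevance of a ranked list: relevant items at ranks 1 and 3.
print(ndcg_at_k([1, 0, 1, 0, 0], 5))  # ≈ 0.92
```

Because the log discount decays slowly, nDCG is less top-heavy than precision@k, which is one reason metrics can disagree about which ranker is better.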
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
The results show that the proposed model can better capture the quality of a recommender system than traditional evaluation does, and is not affected by characteristics of the data (e.g. size).
Mix and Rank: A Framework for Benchmarking Recommender Systems
This work proposes a novel benchmarking framework that mixes different evaluation measures in order to rank the recommender systems on each benchmark dataset separately, and discovers sets of correlated measures as well as sets of evaluation measures that are least correlated.
Evaluating Recommender Systems: A Systemized Quantitative Survey
Replicating the results of a recommender system's evaluation is one of the main concerns in the area. This paper discusses this issue from different angles: 1) it investigates the uniformity of…
New approaches for evaluation: correctness and freshness: Extended Abstract
A family of metrics that combines precision and coverage in a principled manner is presented, and a measure is provided to account for how much a system promotes fresh items in its recommendations (freshness).
Goal-driven collaborative filtering
This thesis studies recommender systems from a goal-oriented point of view: recommendation goals are defined, their associated measures are built, and a unified error-minimisation framework is proposed that flexibly covers various (directional) risk preferences.

References

Showing 1–10 of 14 references
Evaluating collaborative filtering recommender systems
The key decisions in evaluating collaborative filtering recommender systems are reviewed: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
Being accurate is not enough: how accuracy metrics have hurt recommender systems
This paper proposes informal arguments that the recommender community should move beyond the conventional accuracy metrics and their associated experimental methodologies, and proposes new user-centric directions for evaluating recommender systems.
Goal-Driven Collaborative Filtering - A Directional Error Based Approach
This paper proposes a flexible optimization framework that can adapt to individual recommendation goals, and introduces a Directional Error Function to capture the cost (risk) of each individual prediction, which can be learned from the specified performance measures at hand.
Performance of recommender algorithms on top-n recommendation tasks
An extensive evaluation of several state-of-the-art recommender algorithms suggests that algorithms optimized for minimizing RMSE do not necessarily perform as expected on the top-N recommendation task, and new variants of two collaborative filtering algorithms are offered.
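
That finding, that a low-RMSE model can still rank poorly at the top, is easy to reproduce on toy data. The sketch below (illustrative names, ratings, and threshold, not data from the paper) builds two predictors where the one with the lower RMSE has the worse precision@1:

```python
import math

def rmse(preds, truth):
    """Root mean squared error of rating predictions."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(preds))

def precision_at_n(preds, truth, n, threshold=4.0):
    """Rank items by predicted score and count how many of the top-n
    are truly relevant (true rating >= threshold)."""
    order = sorted(range(len(preds)), key=lambda i: preds[i], reverse=True)
    return sum(1 for i in order[:n] if truth[i] >= threshold) / n

truth = [5.0, 3.0]           # item 0 is relevant, item 1 is not
pred_low_rmse = [4.0, 4.2]   # small rating errors, but ranks item 1 first
pred_high_rmse = [3.0, 1.0]  # large rating errors, but ranks item 0 first

print(rmse(pred_low_rmse, truth), precision_at_n(pred_low_rmse, truth, 1))    # lower RMSE, precision 0.0
print(rmse(pred_high_rmse, truth), precision_at_n(pred_high_rmse, truth, 1))  # higher RMSE, precision 1.0
```

RMSE penalises errors uniformly across the rating scale, while top-N precision only cares about the ordering near the top of the list, so the two metrics can disagree.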
Evaluating Recommendation Systems
This paper discusses how to compare recommenders based on a set of properties that are relevant for the application, and focuses on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms.
Factorization meets the neighborhood: a multifaceted collaborative filtering model
The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model, and a new evaluation metric is suggested which highlights the differences among methods, based on their performance at a top-K recommendation task.
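
Top-K evaluations of this kind typically rank each relevant test item against a sample of other items and count how often it lands in the top K. A hedged sketch of such a protocol (function name, sampling scheme, and toy scorer are illustrative, not the paper's exact metric):

```python
import random

def sampled_topk_hit_rate(score, test_pairs, all_items, k, n_random=100, seed=0):
    """For each (user, relevant_item) pair, rank the relevant item against
    n_random other items under `score`; count a hit when it makes the top k."""
    rng = random.Random(seed)
    hits = 0
    for user, rel_item in test_pairs:
        candidates = rng.sample([i for i in all_items if i != rel_item], n_random)
        ranked = sorted(candidates + [rel_item],
                        key=lambda item: score(user, item), reverse=True)
        hits += rel_item in ranked[:k]
    return hits / len(test_pairs)

# Toy scorer that prefers low item ids, so test items 1 and 2 rank near the top.
score = lambda user, item: -item
print(sampled_topk_hit_rate(score, [(0, 1), (0, 2)], list(range(50)), k=3, n_random=10))
```

Sampling only `n_random` negatives keeps the evaluation cheap, but the choice of sample size and candidate pool is exactly the kind of design decision the main paper shows can change the reported numbers.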
Optimizing multiple objectives in collaborative filtering
A general recommendation optimization framework is proposed that not only considers the predicted preference scores but also deals with additional operational or resource-related recommendation goals, demonstrating through realistic examples how to extend existing rating prediction algorithms by biasing the recommendation depending on external factors such as the availability, profitability or usefulness of an item.
Text Retrieval Methods for Item Ranking in Collaborative Filtering
A common notational framework for IR and rating-based CF is proposed, together with a technique to give CF data a particular structure so that any IR weighting function can be used with it.
kNN CF: a temporal social network
In this work, user-user kNN graphs are analysed from a temporal perspective, retrieving characteristics such as dataset growth, the evolution of similarity between pairs of users, the volatility of user neighbourhoods over time, and emergent properties of the entire graph as the algorithm parameters change.
A collaborative filtering algorithm and evaluation metric that accurately model the user experience
It is empirically demonstrated that two of the most acclaimed CF recommendation algorithms have flaws that result in a dramatically unacceptable user experience, and a new Belief Distribution Algorithm is introduced that overcomes these flaws and provides substantially richer user modeling.