Corpus ID: 46860212

All The Cool Kids, How Do They Fit In?: Popularity and Demographic Biases in Recommender Evaluation and Effectiveness

@inproceedings{Ekstrand2018AllTC,
  title={All The Cool Kids, How Do They Fit In?: Popularity and Demographic Biases in Recommender Evaluation and Effectiveness},
  author={Michael D. Ekstrand and Mucun Tian and Ion Madrazo Azpiazu and Jennifer D. Ekstrand and Oghenemaro Anuyah and David McNeill and Maria Soledad Pera},
  booktitle={FAT},
  year={2018}
}
In the research literature, evaluations of recommender system effectiveness typically report results over a given data set, providing an aggregate measure of effectiveness over each instance (e.g. user) in the data set. Recent advances in information retrieval evaluation, however, demonstrate the importance of considering the distribution of effectiveness across diverse groups of varying sizes. For example, do users of different ages or genders obtain similar utility from the system… 
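The contrast the abstract draws, between one aggregate score and the distribution of effectiveness across user groups, can be illustrated with a minimal sketch. This is not the authors' code: the function name `utility_by_group`, the user IDs, the age-band labels, and the scores are all hypothetical, and the per-user utility is assumed to be an already-computed metric value (for example, nDCG).

```python
from collections import defaultdict
from statistics import mean


def utility_by_group(per_user_utility, user_group):
    """Average a per-user effectiveness score within each demographic group."""
    scores = defaultdict(list)
    for user, utility in per_user_utility.items():
        scores[user_group[user]].append(utility)
    return {group: mean(values) for group, values in scores.items()}


# Hypothetical per-user scores and group labels, for illustration only.
per_user_utility = {"u1": 0.25, "u2": 0.5, "u3": 0.75, "u4": 1.0}
user_group = {"u1": "18-24", "u2": "18-24", "u3": "25-34", "u4": "25-34"}

# The single aggregate number that evaluations typically report.
overall = mean(per_user_utility.values())

# The same scores broken out by group, which can reveal gaps the aggregate hides.
by_group = utility_by_group(per_user_utility, user_group)

print(overall)
print(by_group)
```

In this toy data the overall mean sits between two quite different group means, which is the kind of disparity a single aggregate figure would not show.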
Revisiting Popularity and Demographic Biases in Recommender Evaluation and Effectiveness
TLDR
It is found that total usage and the popularity of consumed content are strong predictors of recommender performance and also vary significantly across demographic groups, and that the utility is higher for users from countries with more representation in the dataset.
The Effect of Algorithmic Bias on Recommender Systems for Massive Open Online Courses
TLDR
This paper examines existing algorithms and their recommendation lists for biases related to course popularity, catalog coverage, and course category popularity, and underscores the need to better understand how recommenders respond to bias in diverse contexts.
Analyzing Item Popularity Bias of Music Recommender Systems: Are Different Genders Equally Affected?
TLDR
This work investigates the algorithmic popularity bias of seven common recommendation algorithms (five collaborative filtering and two baselines), focusing on music recommendation, with experiments on the recently released standardized LFM-2b dataset, which contains listening profiles of Last.fm users.
The Relationship between the Consistency of Users' Ratings and Recommendation Calibration
TLDR
There is a positive correlation between the consistency of users' ratings behavior and the degree of calibration in their recommendations, meaning that user groups with higher inconsistency in their ratings receive less calibrated recommendations.
Exploring author gender in book rating and recommendation
TLDR
It is found that common collaborative filtering algorithms tend to propagate at least some of each user’s tendency to rate or read male or female authors into their resulting recommendations, although they differ in both the strength of this propagation and the variance in the gender balance of the recommendation lists they produce.
Popularity Bias in Recommendation: A Multi-stakeholder Perspective
TLDR
This dissertation, which studies the impact of popularity bias in recommender systems from a multi-stakeholder perspective, proposes several algorithms each approaching the popularity bias mitigation from a different angle and compares their performances using several metrics with some other state-of-the-art approaches in the literature.
The Connection Between Popularity Bias, Calibration, and Fairness in Recommendation
TLDR
There is a connection between how different user groups are affected by algorithmic popularity bias and their level of interest in popular items, and a metric called miscalibration is used for measuring how a recommendation algorithm is responsive to users’ true preferences.
Exploring author gender in book rating and recommendation
TLDR
This work measures the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data to find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution.
Investigating Potential Factors Associated with Gender Discrimination in Collaborative Recommender Systems
TLDR
Several characteristics of user profiles are studied to analyze their possible associations with disparate behavior of the system towards different genders; the results show that women get less accurate recommendations than men, indicating that recommendation algorithms are unfair across genders.

References

SHOWING 1-10 OF 36 REFERENCES
The Comparability of Recommender System Evaluations and Characteristics of Docear's Users
TLDR
This paper shows that reporting demographic and usage-based data is crucial in order to create meaningful evaluations on Docear's recommender system and sets previous evaluations into context and helps others to compare their results with the authors'.
A Comparison of How Demographic Data Affects Recommendation
TLDR
A comparison of recommendation results when using different demographic features commonly available in online movie recommendation communities, and results of a simple method that extends standard collaborative filtering algorithms to include one or several of these features.
Evaluating Recommendation Systems
TLDR
This paper discusses how to compare recommenders based on a set of properties that are relevant for the application, and focuses on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms.
Explaining the user experience of recommender systems
TLDR
This paper proposes a framework that takes a user-centric approach to recommender system evaluation that links objective system aspects to objective user behavior through a series of perceptual and evaluative constructs (called subjective system aspects and experience, respectively).
Performance prediction and evaluation in recommender systems: An information retrieval perspective
TLDR
This thesis investigates the definition and formalisation of performance prediction methods for recommender systems, and evaluates the quality of the proposed solutions in terms of the correlation between the predicted and the observed performance on test data.
A Survey of Accuracy Evaluation Metrics of Recommendation Tasks
TLDR
This paper reviews the proper construction of offline experiments for deciding on the most appropriate algorithm, discusses three important tasks of recommender systems, and classifies a set of appropriate, well-known evaluation metrics for each task.
Performance of recommender algorithms on top-n recommendation tasks
TLDR
An extensive evaluation of several state-of-the-art recommender algorithms suggests that algorithms optimized for minimizing RMSE do not necessarily perform as expected on the top-N recommendation task, and new variants of two collaborative filtering algorithms are offered.
Evaluating collaborative filtering recommender systems
TLDR
The key decisions in evaluating collaborative filtering recommender systems are reviewed: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
Precision-oriented evaluation of recommender systems: an algorithmic comparison
TLDR
In three experiments with three state-of-the-art recommenders, four of the evaluation methodologies are consistent with each other and differ from error metrics, in terms of the comparative recommenders' performance measurements.
Hybrid Recommender Systems: Survey and Experiments
R. Burke, User Modeling and User-Adapted Interaction (Computer Science), 2004
TLDR
This paper surveys the landscape of actual and possible hybrid recommenders, and introduces a novel hybrid, EntreeC, a system that combines knowledge-based recommendation and collaborative filtering to recommend restaurants, and shows that semantic ratings obtained from the knowledge-based part of the system enhance the effectiveness of collaborative filtering.