SimuRec: Workshop on Synthetic Data and Simulation Methods for Recommender Systems Research

Authors: Michael D. Ekstrand, Allison June-Barlow Chaney, Pablo Castells, Robin D. Burke, David Rohde, Manel Slokom
Venue: Fifteenth ACM Conference on Recommender Systems
There has been significant recent interest in using synthetic data and simulation infrastructures for various types of recommender systems research. However, there are currently no clear best practices for applying these methods. We proposed a workshop to bring together researchers and practitioners interested in simulating recommender systems and their data to discuss the state of the art of such research and the pressing open methodological questions. The workshop resulted in a…
Recommendation Fairness: From Static to Dynamic
This paper first reviews recent developments in recommender systems, then discusses how fairness could be built into reinforcement learning techniques for recommendation. It argues that further progress in recommendation fairness may require considering multi-agent (game-theoretic) optimization, multi-objective (Pareto) optimization, and simulation-based optimization within the general framework of stochastic games.


Comparing recommender systems using synthetic data
In this work, we propose SynRec, a data protection framework that uses data synthesis. The goal is to protect sensitive information in the user-item matrix by replacing the original values with
Empirical Analysis of Attribute-Aware Recommender System Algorithms Using Synthetic Data
Synthetic data gives a reasonably good overview of the behavior of attribute-aware algorithms when compared with results obtained on real-life datasets, and varying the synthetic data makes it possible to observe how the algorithms behave as the characteristics of the data change.
Data Masking for Recommender Systems: Prediction Performance and Rating Hiding
The experimental results demonstrate that the relative performance of algorithms, which is the key property that a data science challenge must measure, is comparable between the original data and the data masked with Shuffle-NNN.
RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising
RecoGym is introduced: an RL environment for recommendation, defined by a model of user traffic patterns on e-commerce sites and of users' responses to recommendations on publisher websites. It could open up an avenue of collaboration between the recommender systems and reinforcement learning communities and lead to better alignment between offline and online performance metrics.
Should I Follow the Crowd?: A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems
A crowdsourced dataset devoid of the usual biases displayed by common publicly available data is built, in which contradictions between the accuracy that would be measured in a common biased offline experimental setting, and the actual accuracy that can be measured with unbiased observations are illustrated.
Synthetic Attribute Data for Evaluating Consumer-side Fairness
The Frequency-Linked Attribute Generation (FLAG) algorithm is described, and its applicability for assigning synthetic demographic attributes to recommendation data sets is shown.
How algorithmic confounding in recommendation systems increases homogeneity and decreases utility
Using simulations, it is demonstrated that training or evaluating on data from users already exposed to algorithmic recommendations homogenizes user behavior without increasing utility.
Estimating Error and Bias in Offline Evaluation Results
It is found that missing data in the rating or observation process causes the evaluation protocol to systematically mis-estimate metric values, and in some cases erroneously determine that a popularity-based recommender outperforms even a perfect personalized recommender.
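The effect described in the last summary can be illustrated with a toy simulation. The sketch below is not the method of any paper listed here; the function `simulate`, the head/tail item split, and all rates (a 0.5 chance of liking each head item, observation rates of 0.9 and 0.02) are assumptions chosen only to make the mechanism visible. For brevity it also skips the train/test split that a real offline protocol would use. Each recommendation list is scored twice: against the true preferences and against the incomplete observed data.

```python
import random
from collections import Counter


def simulate(n_users=2000, n_head=10, n_tail=200, list_len=10, seed=0):
    """Score an oracle and a popularity recommender on true vs. observed data."""
    rng = random.Random(seed)
    head = range(n_head)                    # a few widely consumed items
    tail = range(n_head, n_head + n_tail)   # many niche items

    def p_seen(item):
        # Assumed observation rates: head items are almost always logged,
        # tail items almost never.
        return 0.9 if item < n_head else 0.02

    liked, seen = [], []
    for _ in range(n_users):
        likes = {i for i in head if rng.random() < 0.5}
        likes |= set(rng.sample(tail, list_len))
        liked.append(likes)
        # Only a biased subset of the true preferences is ever observed.
        seen.append({i for i in likes if rng.random() < p_seen(i)})

    # Popularity recommender: the most frequently observed items, for everyone.
    counts = Counter(i for s in seen for i in s)
    popular = [i for i, _ in counts.most_common(list_len)]
    popular_recs = [popular] * n_users

    # Oracle recommender: a random list of items the user truly likes.
    oracle_recs = [rng.sample(sorted(l), list_len) for l in liked]

    def precision(recs, targets):
        hits = sum(len(set(r) & t) for r, t in zip(recs, targets))
        return hits / (list_len * n_users)

    return {
        "oracle": {"true": precision(oracle_recs, liked),
                   "measured": precision(oracle_recs, seen)},
        "popular": {"true": precision(popular_recs, liked),
                    "measured": precision(popular_recs, seen)},
    }


if __name__ == "__main__":
    for name, scores in simulate().items():
        print(name, scores)
```

Under these assumed rates, the oracle has perfect true precision by construction, yet its measured precision falls below that of the popularity recommender, because most of the niche items it recommends were never observed.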