• Corpus ID: 235795003

SVP-CF: Selection via Proxy for Collaborative Filtering Data

Noveen Sachdeva, Carole-Jean Wu, Julian McAuley
We study the practical consequences of dataset sampling strategies on the performance of recommendation algorithms. Recommender systems are generally trained and evaluated on samples of larger datasets. Samples are often taken in a naïve or ad-hoc fashion: e.g., by sampling a dataset randomly or by selecting users or items with many interactions. As we demonstrate, commonly used data sampling schemes can have significant consequences on algorithm performance, masking performance deficiencies in… 
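The two naïve schemes named in the abstract can be sketched in a few lines. This is only an illustration of random interaction sampling and of keeping the most active users; it is not the paper's SVP-CF method, and the function names and toy data are hypothetical.

```python
import random
from collections import Counter


def sample_interactions_randomly(interactions, fraction, seed=0):
    """Keep a uniformly random fraction of (user, item) interactions."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = int(len(interactions) * fraction)
    return rng.sample(interactions, k)


def sample_head_users(interactions, fraction):
    """Keep every interaction of the most active fraction of users."""
    counts = Counter(user for user, _ in interactions)
    n_keep = max(1, int(len(counts) * fraction))
    keep = {user for user, _ in counts.most_common(n_keep)}
    return [(user, item) for user, item in interactions if user in keep]


# Toy interaction log (assumed data, for illustration only).
interactions = [
    ("u1", "a"), ("u1", "b"), ("u1", "c"),
    ("u2", "a"),
    ("u3", "b"), ("u3", "c"),
    ("u4", "d"),
]

random_sample = sample_interactions_randomly(interactions, 0.5)
head_sample = sample_head_users(interactions, 0.5)
```

On this toy log the head-user sample retains only the two most active users, which drops every interaction with item "d"; it is a small instance of how a sampling scheme can change what a model is trained and evaluated on.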

DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning

Extensive experimental results show that, although some methods perform better in certain experimental settings, random selection is still a strong baseline for coreset selection in deep learning.


Sustainable AI: Environmental Implications, Challenges and Opportunities

The carbon footprint of AI computing is characterized by examining the model development cycle across industry-scale machine learning use cases and, at the same time, considering the life cycle of system hardware.



On Sampling Strategies for Neural Network-based Collaborative Filtering

A general neural network-based recommendation framework is proposed, which subsumes several existing state-of-the-art recommendation algorithms, and the efficiency issue is addressed by investigating sampling strategies in the stochastic gradient descent training for the framework.

Self-Attentive Sequential Recommendation

Extensive empirical studies show that the proposed self-attention based sequential model (SASRec) outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets.

BPR: Bayesian Personalized Ranking from Implicit Feedback

This paper presents a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem and provides a generic learning algorithm for optimizing models with respect to BPR-Opt.

Debiasing Item-to-Item Recommendations With Small Annotated Datasets

This paper develops a principled approach for item-to-item recommendation based on causal inference and presents a practical and highly effective method for estimating the causal parameters from a small annotated dataset.

Are we really making much progress? A worrying analysis of recent neural recommendation approaches

A systematic analysis of algorithmic proposals for top-n recommendation tasks that were presented at top-level research conferences in recent years sheds light on a number of potential problems in today's machine learning scholarship and calls for improved scientific practices in this area.

Variational Autoencoders for Collaborative Filtering

A generative model with multinomial likelihood is introduced, using Bayesian inference for parameter estimation; the pros and cons of employing a principled Bayesian inference approach are identified, and the settings where it provides the most significant improvements are characterized.

Neural Collaborative Filtering

This work strives to develop techniques based on neural networks to tackle the key problem in recommendation, collaborative filtering, on the basis of implicit feedback, and presents a general framework named NCF, short for Neural network-based Collaborative Filtering.

Graph Convolutional Neural Networks for Web-Scale Recommender Systems

A novel method based on highly efficient random walks to structure the convolutions and a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model are developed.

Off-policy Bandits with Deficient Support

This work systematically analyzed the statistical and computational properties of three approaches that provide various guarantees for IPS-based learning despite the inherent limitations of support-deficient data: restricting the action space, reward extrapolation, and restricting the policy space.

Data Mining Methods for Recommender Systems

In this chapter, an overview of the main Data Mining techniques used in the context of Recommender Systems is given, including Bayesian Networks and Support Vector Machines.