Common Pitfalls in Training and Evaluating Recommender Systems

@article{Chen2017CommonPI,
  title={Common Pitfalls in Training and Evaluating Recommender Systems},
  author={Hung-Hsuan Chen and Chu-An Chung and Hsin-Chien Huang and Wen Tsui},
  journal={SIGKDD Explor.},
  year={2017},
  volume={19},
  pages={37-45}
}
This paper formally presents four common pitfalls in training and evaluating recommendation algorithms for information systems. Specifically, we show that it could be problematic to separate the server logs into training and test data for model generation and model evaluation if the training and the test data are selected improperly. In addition, we show that click-through rate -- a common metric to measure and compare the performance of different recommendation algorithms -- may not be a good…
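A quick way to see the first pitfall is to compare a purely random split of the logs with a chronological one. The sketch below is illustrative only (the column names and the pandas-based layout are assumptions, not the paper's setup): a random split lets interactions that happened after a test event leak into training, whereas a time-based split keeps the evaluation closer to how the recommender would actually be deployed.

```python
# Illustrative sketch (not the paper's code): contrasting a random split, which can
# leak "future" interactions into training, with a chronological split of server logs.
# Assumes a pandas DataFrame `logs` with columns: user_id, item_id, timestamp.
import pandas as pd

def random_split(logs: pd.DataFrame, test_frac: float = 0.2, seed: int = 0):
    """Random split: test events may precede training events, leaking future signal."""
    shuffled = logs.sample(frac=1.0, random_state=seed)
    n_test = int(len(shuffled) * test_frac)
    return shuffled.iloc[n_test:], shuffled.iloc[:n_test]

def temporal_split(logs: pd.DataFrame, test_frac: float = 0.2):
    """Chronological split: everything after the time cutoff goes to the test set."""
    ordered = logs.sort_values("timestamp")
    cutoff = int(len(ordered) * (1.0 - test_frac))
    return ordered.iloc[:cutoff], ordered.iloc[cutoff:]
```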
On Offline Evaluation of Recommender Systems
TLDR
It is shown that access to different amounts of future data may improve or deteriorate a model's recommendation accuracy, and that more historical data in the training set does not necessarily lead to better recommendation accuracy.
Differentiating Regularization Weights -- A Simple Mechanism to Alleviate Cold Start in Recommender Systems
TLDR
The proposed methodology is applied to three baseline models -- SVD, SVD++, and NMF -- and it is found that the technique improves prediction accuracy for all of these baseline models and better predicts the ratings on long-tail items, i.e., the items that were rated/viewed/purchased by few users.
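The summary does not spell out the exact weighting scheme, so the sketch below only illustrates the general idea under an assumed rule: each user's and item's latent factors get their own regularization weight, larger when the entity has few ratings. The function names and the inverse-square-root rule are assumptions for illustration.

```python
# Hedged sketch: per-user / per-item regularization weights for matrix factorization.
# The inverse-square-root weighting is an illustrative assumption, not necessarily
# the exact mechanism proposed in the paper.
import numpy as np

def per_entity_lambda(counts: np.ndarray, base_lambda: float = 0.05) -> np.ndarray:
    """Larger regularization for users/items with fewer observed ratings."""
    return base_lambda / np.sqrt(np.maximum(counts, 1))

def mf_loss(R, P, Q, user_counts, item_counts, base_lambda=0.05):
    """Squared error on observed entries plus per-user and per-item L2 penalties."""
    lam_u = per_entity_lambda(user_counts, base_lambda)   # shape (n_users,)
    lam_i = per_entity_lambda(item_counts, base_lambda)   # shape (n_items,)
    mask = ~np.isnan(R)                                   # missing ratings are NaN
    err = np.where(mask, R - P @ Q.T, 0.0)
    reg = (lam_u[:, None] * P**2).sum() + (lam_i[:, None] * Q**2).sum()
    return (err**2).sum() + reg
```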
Behavior2Vec: Generating Distributed Representations of Users' Behaviors on Products for Recommender Systems
TLDR
By leveraging the cosine distance between the distributed representations of behaviors on items under different contexts, a user's next clicked or purchased item is predicted more precisely than with several baseline methods.
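The prediction step described here boils down to a nearest-neighbor lookup in the embedding space. A minimal sketch, assuming the behavior embeddings have already been trained (the word2vec-style training itself is not shown), could look like this:

```python
# Minimal sketch: rank candidate next items by cosine similarity to the embedding of
# the user's current behavior. The embeddings are assumed to be pre-trained; the
# data structures here are illustrative, not the paper's implementation.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def rank_next_items(current_vec: np.ndarray, candidates: dict) -> list:
    """`candidates` maps item id -> embedding vector; returns item ids by similarity."""
    scores = {item: cosine(current_vec, vec) for item, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```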
Empirically Testing Deep and Shallow Ranking Models for Click-Through Rate (CTR) Prediction
TLDR
An error analysis is performed to investigate when deep learning models perform better than simple models and when they do not; it is found that recommendations based on a simple neighbor-based model, on average, outperform the results generated by deep learning models on two datasets from e-commerce websites.
Personalized Travel Product Recommendation Based on Embedding of Multi-Behavior Interaction Network and Product Information Knowledge Graph
  • Li-Pin Xiao, Po-Ruey Lei, W. Peng
  • Computer Science
  • 2020 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)
  • 2020
TLDR
A hybrid recommendation model is proposed to tackle two challenges in recommendation systems -- the cold product issue and the skewed distribution problem -- by taking product information into consideration: the metadata of products is used and additional features are extracted from the textual contents to form a knowledge graph.
Online Indices for Predictive Top-k Entity and Aggregate Queries on Knowledge Graphs
TLDR
The notion of a virtual knowledge graph, which extends a knowledge graph with predicted edges and their probabilities, is defined; the proposed approach is one to two orders of magnitude faster in query processing than the closest previous work, which can only handle one relationship type.
Accelerating Matrix Factorization by Overparameterization
TLDR
It is confirmed that overparameterization can significantly accelerate the optimization of MF with no change in the expressiveness of the learning model, and it is suggested to accelerate the training of learning-based recommendation models in this way whenever possible, especially when the size of the training…
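One common way to overparameterize matrix factorization without enlarging the model class is to write each rank-k factor matrix as a product of wider matrices; whether this matches the paper's exact construction is an assumption, and the sketch below only illustrates that the reparameterized model still has rank at most k.

```python
# Hedged sketch: replacing each factor matrix of a rank-k MF with a product of two
# wider matrices. The product still has rank <= k, so expressiveness is unchanged,
# but gradient-based training may behave differently. Shapes are illustrative.
import numpy as np

n_users, n_items, k, hidden = 1000, 500, 16, 64
rng = np.random.default_rng(0)

# Standard MF: R_hat = P @ Q.T with P (n_users x k) and Q (n_items x k).
P = rng.normal(scale=0.1, size=(n_users, k))
Q = rng.normal(scale=0.1, size=(n_items, k))

# Overparameterized variant: P -> A @ B and Q -> C @ D.
A = rng.normal(scale=0.1, size=(n_users, hidden))
B = rng.normal(scale=0.1, size=(hidden, k))
C = rng.normal(scale=0.1, size=(n_items, hidden))
D = rng.normal(scale=0.1, size=(hidden, k))
R_hat = (A @ B) @ (C @ D).T   # still a rank-<=k prediction matrix
```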
Mining the BoardGameGeek

References

Showing 1-10 of 24 references
Evaluating collaborative filtering recommender systems
TLDR
The key decisions in evaluating collaborative filtering recommender systems are reviewed: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
Solving the apparent diversity-accuracy dilemma of recommender systems
TLDR
This paper introduces a new algorithm specifically to address the challenge of diversity and shows how it can be used to resolve this apparent dilemma when combined in an elegant hybrid with an accuracy-focused algorithm.
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
TLDR
This paper introduces a replay methodology for contextual bandit algorithm evaluation that is completely data-driven, very easy to adapt to different applications, and able to provide provably unbiased evaluations.
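The core of the replay idea can be written in a few lines. The sketch below is a generic rendering, not the paper's code: it assumes the logged arms were chosen uniformly at random, keeps only the events where the evaluated policy agrees with the logged choice, and averages the observed rewards over those matched events.

```python
# Hedged sketch of replay-style offline evaluation for a contextual bandit policy.
# Unbiasedness relies on the logging policy choosing arms uniformly at random;
# the event layout and function signature are illustrative assumptions.
def replay_evaluate(policy, logged_events):
    """policy(context, arms) -> chosen arm; logged_events yields (context, arms, shown_arm, reward)."""
    total_reward, matched = 0.0, 0
    for context, arms, shown_arm, reward in logged_events:
        if policy(context, arms) == shown_arm:
            total_reward += reward
            matched += 1
    return total_reward / matched if matched else float("nan")
```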
Item-based top-N recommendation algorithms
TLDR
This article presents one class of model-based recommendation algorithms that first determine the similarities between the various items and then use them to identify the set of items to be recommended, and shows that these item-based algorithms are up to two orders of magnitude faster than traditional user-neighborhood-based recommender systems while providing recommendations of comparable or better quality.
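As a rough illustration of the two-stage recipe (the paper's exact similarity weighting and normalization may differ), the sketch below computes item-item cosine similarities offline and then scores unseen items by their similarity to the items a user has already interacted with.

```python
# Minimal sketch of generic item-based top-N recommendation; similarity weighting
# and normalization details from the paper are not reproduced here.
import numpy as np

def item_similarities(R: np.ndarray) -> np.ndarray:
    """R is a binary user-item interaction matrix (n_users x n_items)."""
    norms = np.linalg.norm(R, axis=0, keepdims=True) + 1e-12
    return (R.T @ R) / (norms.T @ norms)

def top_n(R: np.ndarray, sims: np.ndarray, user: int, n: int = 10) -> np.ndarray:
    scores = R[user] @ sims            # sum of similarities to items the user consumed
    scores[R[user] > 0] = -np.inf      # do not re-recommend already-seen items
    return np.argsort(-scores)[:n]
```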
Item popularity and recommendation accuracy
TLDR
A new accuracy measure is defined that has the desirable property of providing nearly unbiased estimates concerning recommendation accuracy and also motivates a refinement for training collaborative-filtering approaches.
Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study
TLDR
This paper proposes to address the problem of estimating online metrics that depend on user feedback using causal inference techniques, under the contextual-bandit framework, and obtains very promising results that suggest the wide applicability of these techniques.
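A standard causal-inference tool in this setting is inverse propensity scoring (IPS). The sketch below shows only the generic IPS estimator for a click metric; the estimator actually used in the case study may include further corrections, and the field names here are assumptions.

```python
# Hedged sketch of a generic IPS estimate of a click metric under a new policy,
# using logged (context, action, click, logging_prob) events.
def ips_click_estimate(logged_events, target_policy_prob):
    """target_policy_prob(context, action) -> probability the new policy shows `action`."""
    total, n = 0.0, 0
    for context, action, click, logging_prob in logged_events:
        weight = target_policy_prob(context, action) / logging_prob
        total += weight * click
        n += 1
    return total / n if n else float("nan")
```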
RecSys Challenge 2015 and the YOOCHOOSE Dataset
TLDR
The 2015 ACM Recommender Systems Challenge offered the opportunity to work on a large-scale e-commerce dataset from a big European retailer that uses a recommender system as a service provided by YOOCHOOSE; the challenge attracted 850 teams from 49 countries, which submitted a total of 5,437 solutions.
Diversifying search results
TLDR
This work generalizes several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification, and proposes an algorithm that approximates this objective well in general and is provably optimal for a natural special case.
Amazon.com Recommendations: Item-to-Item Collaborative Filtering
TLDR
This work compares three common approaches to the recommendation problem -- traditional collaborative filtering, cluster models, and search-based methods -- with the authors' own algorithm, called item-to-item collaborative filtering.
Field-aware Factorization Machines for CTR Prediction
TLDR
This paper establishes FFMs as an effective method for classifying large sparse data, including data from CTR prediction, proposes efficient implementations for training FFMs, and comprehensively analyzes FFMs.
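The distinguishing feature of FFMs is that every feature keeps a separate latent vector for each field, and the interaction between two features uses each feature's vector for the other feature's field. A minimal sketch of that pairwise-interaction term (training with logistic loss, hashing, and other engineering details are omitted, and the data layout is an assumption):

```python
# Sketch of the FFM pairwise-interaction term. `active` holds the non-zero features
# of one instance as (field, feature, value) triples; `latent` maps
# (feature, field) -> latent vector. Layout is illustrative.
import numpy as np

def ffm_interaction(active, latent):
    total = 0.0
    for a in range(len(active)):
        for b in range(a + 1, len(active)):
            f1, j1, x1 = active[a]
            f2, j2, x2 = active[b]
            total += float(latent[(j1, f2)] @ latent[(j2, f1)]) * x1 * x2
    return total
```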