A contextual-bandit approach to personalized news article recommendation

@article{Li2010ACA,
  title={A contextual-bandit approach to personalized news article recommendation},
  author={Lihong Li and Wei Chu and John Langford and Robert E. Schapire},
  journal={ArXiv},
  year={2010},
  volume={abs/1003.0146}
}
Personalized web services strive to adapt their services (advertisements, news articles, etc.) to individual users by making use of both content and user information. [...] Second, we argue that any bandit algorithm can be reliably evaluated offline using previously recorded random traffic. Finally, using this offline evaluation method, we successfully applied our new algorithm to a Yahoo! Front Page Today Module dataset containing over 33 million events. Results showed a 12.5% click lift compared to…
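The offline evaluation idea in the abstract can be illustrated with a small replay-style sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the logged traffic was collected by showing arms uniformly at random, independently of context, and the names `LoggedEvent`, `replay_evaluate`, `choose_arm`, and `update` are hypothetical. An event is used only when the policy under test picks the same arm that was actually shown, since the reward for any other arm was never observed.

```python
from typing import Callable, List, Sequence, Tuple

# One logged event (hypothetical layout):
# (context features, arm shown by the uniformly random logger, observed reward)
LoggedEvent = Tuple[Sequence[float], int, float]


def replay_evaluate(
    events: List[LoggedEvent],
    choose_arm: Callable[[Sequence[float]], int],
    update: Callable[[Sequence[float], int, float], None],
) -> Tuple[float, int]:
    """Replay logged events against a policy.

    Returns (average reward over matched events, number of matched events).
    Assumes the logged arms were drawn uniformly at random.
    """
    total_reward = 0.0
    matched = 0
    for context, logged_arm, reward in events:
        if choose_arm(context) != logged_arm:
            # The policy would have shown a different arm, whose reward
            # was never observed, so this event is skipped.
            continue
        # The policy agrees with the log: reveal the reward and let it learn.
        update(context, logged_arm, reward)
        total_reward += reward
        matched += 1
    average = total_reward / matched if matched else 0.0
    return average, matched
```

With K arms and uniform logging, roughly one event in K is matched, so the estimate needs a correspondingly larger log than an online test would.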
Personalized Recommendation via Parameter-Free Contextual Bandits
TLDR
This work proposes a parameter-free bandit strategy, which employs a principled resampling approach called online bootstrap, to derive the distribution of estimated models in an online manner and demonstrates the effectiveness of the proposed algorithm in terms of the click-through rate.
Contextual Bandit Approach-based Recommendation System for Personalized Web-based Services
TLDR
The experimental results show that CoLin outperforms Hybrid-LinUCB and LinUCB, reporting cumulated regret of 8.950 for LastFm, 60.34 for MovieLens20M, and 34.10 for Yahoo Front Page Today Module.
Ensemble contextual bandits for personalized recommendation
TLDR
A meta-bandit paradigm is employed that places a hyper bandit over the base bandits, to explicitly explore/exploit the relative importance of base bandits based on user feedback to obtain robust predicted click-through rate (CTR) of web objects.
A Contextual Bandit Approach to Personalized Online Recommendation via Sparse Interactions
TLDR
This paper proposes a novel approach, named SAOR, to make online recommendations via sparse interactions that uses positive and negative responses to build the user preference model, ignoring all non-responses.
Personalized Recommendation via Parameter-Free
TLDR
This paper formulates personalized recommendation as a contextual bandit problem to solve the exploration/exploitation dilemma and proposes a parameter-free bandit strategy, which employs a principled resampling approach called online bootstrap, to derive the distribution of estimated models in an online manner.
Adaptive Linear Contextual Bandits for Online Recommendations
  • 2017
Contextual bandit algorithms have been successfully applied to online recommender systems by dynamically optimizing a trade-off between exploration and exploitation. However, most existing approaches…
Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation
TLDR
It is shown that a bootstrap-based approach significantly reduces this bias and, more importantly, allows it to be controlled; the results of an experiment of unprecedented scale, a public challenge, are also discussed.
Exploiting search history of users for news personalization
TLDR
This paper proposes a novel approach that relies on the concept of search profiles, which are user profiles that are built based on the past interactions of the user with a web search engine, and extensively tests the proposal on real-world datasets obtained from Yahoo.
Collaborative Filtering Bandits
TLDR
This work investigates an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings, showing scalability and increased prediction performance over state-of-the-art methods for clustering bandits on medium-size real-world datasets.
Stochastic Models to Improve E-News Recommender Systems
TLDR
First results demonstrate that models that use only information from the recent past perform best; the work also examines whether these models remain best across varying data contexts and how to generate more personalized models.

References

SHOWING 1-10 OF 101 REFERENCES
Personalized recommendation on dynamic content using predictive bilinear models
TLDR
This work proposes a feature-based machine learning approach to personalized recommendation that is capable of handling the cold-start issue effectively and results in an offline model with light computational overhead compared with other recommender systems that require online re-training.
Naïve filterbots for robust cold-start recommendations
TLDR
This work improves the scalability and performance of a previous approach to handling cold-start situations that uses filterbots, or surrogate users that rate items based only on user or item attributes, and shows that introducing a very small number of simple filterbots helps make CF algorithms more robust.
Google news personalization: scalable online collaborative filtering
TLDR
This paper describes the approach to collaborative filtering for generating personalized recommendations for users of Google News using MinHash clustering, Probabilistic Latent Semantic Indexing, and covisitation counts, and combines recommendations from different algorithms using a linear model.
Just-in-time contextual advertising
TLDR
Empirical evaluation proves that matching ads on the basis of a carefully selected 5% fraction of the page text sacrifices only 1%-3% in ad relevance, and is competitive with matching based on the entire page content.
Online Models for Content Optimization
TLDR
A new content publishing system that selects articles to serve to a user, choosing from an editorially programmed pool that is frequently refreshed, is described and deployed on a major Yahoo! portal, and significantly increases the number of user clicks over the original manual approach.
Explore/Exploit Schemes for Web Content Optimization
TLDR
A Bayesian solution is developed to find the optimal trade-off between explore and exploit for web content publishing applications, where a dynamic set of items with short lifetimes, delayed feedback, and non-stationary reward distributions is typical.
The Adaptive Web, Methods and Strategies of Web Personalization
TLDR
This paper presents a meta-modelling architecture for the adaptive web that automates the very labor-intensive and therefore time-heavy and expensive process of manually cataloging content on the web.
Efficient bandit algorithms for online multiclass prediction
TLDR
The Banditron has the ability to learn in a multiclass classification setting with "bandit" feedback, which only reveals whether or not the prediction made by the algorithm was correct (but does not necessarily reveal the true label).
A case study of behavior-driven conjoint analysis on Yahoo!: front page today module
TLDR
A successful large-scale case study of conjoint analysis on click-through streams in a real-world application at Yahoo! that identifies users' heterogeneous preferences from millions of click/view events and builds predictive models to classify new users into segments of distinct behavior patterns.
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
We consider a non-Bayesian infinite horizon version of the multi-armed bandit problem with the objective of designing simple policies whose regret increases slowly with time. In their seminal work on…