• Corpus ID: 239049409

Locality-Sensitive Experience Replay for Online Recommendation

Xiaocong Chen, Lina Yao, Xianzhi Wang, Julian McAuley

Online recommendation requires handling rapidly changing user preferences. Deep reinforcement learning (DRL) is an effective means of capturing users’ dynamic interest during interactions with recommender systems. Generally, it is challenging to train a DRL agent, due to large state space (e.g., user-item rating matrix and user profiles), action space (e.g., candidate items), and sparse rewards. Existing studies leverage experience replay (ER) to let an agent learn from past experience. However… 

DRN: A Deep Reinforcement Learning Framework for News Recommendation
A Deep Q-Learning based recommendation framework is proposed that models future reward explicitly and considers user return pattern as a supplement to the click / no click label in order to capture more user feedback information.
Large-scale Interactive Recommendation with Tree-structured Policy Gradient
  • Haokun Chen, Xinyi Dai, +5 authors Yong Yu
  • Computer Science, Mathematics
  • 2019
A Tree-structured Policy Gradient Recommendation (TPGR) framework, where a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree.
RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising
RecoGym is introduced, an RL environment for recommendation defined by a model of user traffic patterns on e-commerce sites and of users' responses to recommendations on publisher websites. It could open up an avenue of collaboration between the recommender systems and reinforcement learning communities and lead to better alignment between offline and online performance metrics.
Deep reinforcement learning for page-wise recommendations
A principled approach is proposed to jointly generate a set of complementary items and the corresponding strategy to display them in a 2-D page, together with DeepPage, a novel page-wise recommendation framework based on deep reinforcement learning that can optimize a page of items with proper display based on real-time feedback from users.
Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation
This work proposes a model-based reinforcement learning solution which models user-agent interaction for offline policy learning via a generative adversarial network and uses a discriminator to evaluate the quality of generated data and scale the generated rewards.
Policy Gradients for Contextual Recommendations
Policy Gradients for Contextual Recommendations (PGCR) is put forward to solve the problem without unrealistic assumptions. It optimizes over a restricted class of policies in which the marginal probability of choosing an item has a simple closed form and the gradient of the expected return over policies in this class has a succinct form.
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have been an effective strategy to overcome such information overload. The utility of recommender systems cannot be overstated,
End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding
The proposed EDRR effectively achieves the end-to-end training purpose for both policy-based and value-based RL models, and delivers better performance than state-of-the-art methods.
"Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator
An overview of deep reinforcement learning for search, recommendation, and online advertising is given, from methodologies to applications, together with a review of representative algorithms and a discussion of some appealing research directions.
Stabilizing Reinforcement Learning in Dynamic Environment with Application to Online Recommendation
This paper proposes two techniques to alleviate the unstable reward estimation problem in dynamic environments, the stratified sampling replay strategy and the approximate regretted reward, which address the problem from the sample aspect and the reward aspect, respectively.