Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems

  • Lixin Zou, Long Xia, Zhuoye Ding, Jiaxing Song, Weidong Liu, Dawei Yin
  • Published 13 February 2019
  • Computer Science
  • Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Recommender systems play a crucial role in our daily lives. […] FeedRec includes two components: 1) a Q-Network, designed as a hierarchical LSTM, which models complex user behaviors, and 2) an S-Network, which simulates the environment, assists the Q-Network, and avoids instability of convergence in policy learning. Extensive experiments on synthetic data and real-world large-scale data show that FeedRec effectively optimizes long-term user engagement and outperforms state-of…

Self-Supervised Reinforcement Learning for Recommender Systems

This paper proposes two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC), and integrates them with four state-of-the-art recommendation models.

Self-Supervised Reinforcement Learning for Recommender Systems

This paper proposes two frameworks, namely Self-Supervised Q-learning and Self-Supervised Actor-Critic, and integrates them with four state-of-the-art recommendation models, demonstrating the effectiveness of the approach on real-world datasets.

Self-Supervised Reinforcement Learning for Recommender Systems

The proposed self-supervised reinforcement learning approach for sequential recommendation tasks augments standard recommendation models with two output layers, one for self-supervised learning and the other for RL, and is integrated with four state-of-the-art recommendation models.

Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective

This work proposes a new learning paradigm, Prompt-Based Reinforcement Learning (PRL), for the offline training of RL-based recommendation agents, implements PRL with four notable recommendation models, and conducts experiments on two real-world e-commerce datasets.

Reinforcement Learning based Recommender Systems: A Survey

This survey of reinforcement learning based recommender systems (RLRSs) shows that RLRSs can be generally classified into RL- and DRL-based methods, and proposes an RLRS framework with four components, i.e., state representation, policy optimization, reward formulation, and environment building.

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

This paper proposes ResAct, a generative model which reconstructs behaviors of the online-serving policy by sampling multiple action estimators and designs an effective learning paradigm to train the residual actor which can output the residual for action improvement.

Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation

This work models the recommender system as a Partially Observable and Controllable Markov Decision Process (POC-MDP) and proposes an actor-critic RL framework (RL-LTV) that incorporates item lifetime values (LTV) and outperforms a strong live baseline.

Surrogate for Long-Term User Experience in Recommender Systems

A large-scale study of user behavior logs on one of the largest industrial recommendation platforms, serving billions of users, finds a subset of user behaviors that are predictive of users' increased visits to the platform in 5 months among the group of users with the same visiting frequency to begin with.

A Survey on Reinforcement Learning for Recommender Systems

A thorough overview, comparison, and summary of RL approaches applied in four typical recommender scenarios (interactive recommendation, conversational recommendation, sequential recommendation, and explainable recommendation) is provided.

ACP based reinforcement learning for long-term recommender system

The theoretical analysis and the experiment illustrate that incorporating the ACP approach into a reinforcement learning based recommender system yields better recommendations than existing recommender systems.



Returning is Believing: Optimizing Long-term User Engagement in Recommender Systems

This work rigorously proves that with a high probability its proposed solution achieves a sublinear upper regret bound in maximizing cumulative clicks from a population of users in a given period of time, while a linear regret is inevitable if a user's temporal return behavior is not considered when making the recommendations.

Deep Reinforcement Learning for List-wise Recommendations

This paper proposes a novel recommender system with the capability of continuously improving its strategies during the interactions with users and introduces an online user-agent interacting environment simulator, which can pre-train and evaluate model parameters offline before applying the model online.

Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning

This paper models the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverages Reinforcement Learning (RL) to automatically learn the optimal strategies by recommending items on a trial-and-error basis and receiving reinforcements for these items from users' feedback.

Beyond clicks: dwell time for personalization

A novel method to compute accurate dwell time based on client-side and server-side logging is described and how to normalize dwell time across different devices and contexts is demonstrated.

Deep reinforcement learning for page-wise recommendations

A principled approach to jointly generate a set of complementary items and the corresponding strategy to display them in a 2-D page is proposed, along with DeepPage, a novel page-wise recommendation framework based on deep reinforcement learning that can optimize a page of items with proper display based on real-time feedback from users.

Deep Reinforcement Learning for Search, Recommendation, and Online Advertising: A Survey

This survey gives an overview of deep reinforcement learning for search, recommendation, and online advertising from methodologies to applications, reviews representative algorithms, and discusses some appealing research directions.

Session-based Recommendations with Recurrent Neural Networks

It is argued that, by modeling the whole session, an RNN-based approach can provide more accurate session-based recommendations, and several modifications to classic RNNs, such as a ranking loss function, are introduced that make them more viable for this specific problem.

Online Context-Aware Recommendation with Time Varying Multi-Armed Bandit

A dynamical context drift model based on particle learning is proposed that is able to effectively capture the context change and learn the latent parameters of a contextual multi-armed bandit problem where the reward mapping function changes over time.

Improving recommender systems with adaptive conversational strategies

It is shown that the optimal strategy is different from the fixed one, and supports more effective and efficient interaction sessions, and allows conversational systems to autonomously improve a fixed strategy and eventually learn a better one using reinforcement learning techniques.

Micro Behaviors: A New Perspective in E-commerce Recommender Systems

The effects of micro behaviors on recommendations are uncovered and an interpretable Recommendation framework RIB is proposed, which models inherently the sequence of mIcro Behaviors and their effects.