Maximizing Cumulative User Engagement in Sequential Recommendation: An Online Optimization Perspective

  • Yifei Zhao, Yu-Hang Zhou, Mingdong Ou, Huan Xu, Nan Li
  • Published 2 June 2020
  • Computer Science
  • Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
To maximize cumulative user engagement (e.g., cumulative clicks) in sequential recommendation, it is often necessary to trade off two potentially conflicting objectives: pursuing higher immediate user engagement (e.g., click-through rate) and encouraging user browsing (i.e., exposing more items). Existing works often study these two tasks separately and thus tend to produce sub-optimal results. In this paper, we study this problem from an online optimization perspective, and propose a…
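
The trade-off described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method. It assumes a simplified model in which each shown item has a click probability and a probability that the user keeps browsing afterwards, both independent of earlier items. The function name `expected_cumulative_clicks` and all numbers are hypothetical and chosen only for illustration.

```python
def expected_cumulative_clicks(ctrs, continue_probs):
    """Expected total clicks over a ranked sequence of items.

    Toy model (an assumption, not taken from the paper):
      ctrs[i]           probability the user clicks item i if it is shown
      continue_probs[i] probability the user keeps browsing after item i
    """
    if len(ctrs) != len(continue_probs):
        raise ValueError("ctrs and continue_probs must have the same length")
    reach = 1.0   # probability that the user is still browsing at this position
    total = 0.0
    for ctr, cont in zip(ctrs, continue_probs):
        total += reach * ctr   # a click only counts if the item is exposed
        reach *= cont          # the user may leave after seeing this item
    return total


# Hypothetical policy A: high immediate CTR, but users tend to leave early.
policy_a = expected_cumulative_clicks([0.5, 0.5, 0.5], [0.4, 0.4, 0.4])
# Hypothetical policy B: lower CTR per item, but users browse deeper.
policy_b = expected_cumulative_clicks([0.3, 0.3, 0.3], [0.9, 0.9, 0.9])
print(f"policy A: {policy_a:.3f}, policy B: {policy_b:.3f}")
```

Under these made-up numbers, policy B yields more expected clicks over three positions (about 0.813 versus 0.780) even though its per-item click-through rate is lower, which is the kind of conflict between immediate engagement and browsing depth that the abstract refers to.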


ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

This paper proposes ResAct, a generative model which reconstructs behaviors of the online-serving policy by sampling multiple action estimators and designs an effective learning paradigm to train the residual actor which can output the residual for action improvement.

PrefRec: Preference-based Recommender Systems for Reinforcing Long-term User Engagement

This work proposes a novel paradigm, Preference-based Recommender systems (PrefRec), which allows RL recommender systems to learn from preferences about users' historical behaviors rather than explicitly defined rewards, and designs an effective optimization method for PrefRec, which uses an additional value function, expectile regression, and reward model pre-training to improve performance.

Modeling Attrition in Recommender Systems with Departing Bandits

This work proposes a novel multi-armed bandit setup that captures policy-dependent horizons and provides an efficient learning algorithm that achieves O(sqrt(T) ln(T)) regret, where T is the number of users.

Interpretable Attribute-based Action-aware Bandits for Within-Session Personalization in E-commerce

As the buyer continues on their shopping mission and interacts with different products in an online shop, OPAR learns which attributes the buyer likes and dislikes, forming an interpretable user preference profile and improving re-ranking performance over time, within the same session.

Exploit Customer Life-time Value with Memoryless Experiments

This work proposes a general LTV modeling method that addresses the difficulty of quantifying customers' long-term contribution, whereas existing methods, such as modeling the click-through rate, pursue only the short-term contribution.

User Response Prediction in Online Advertising

A taxonomy is proposed to categorize state-of-the-art user response prediction methods, primarily focusing on the current progress of machine learning methods used in different online platforms; applications of user response prediction, benchmark datasets, and open-source code in the field are also reviewed.



Sequential Recommendation with User Memory Networks

A memory-augmented neural network (MANN) integrated with the insights of collaborative filtering is designed for recommendation; it stores and updates users' historical records explicitly, which enhances the expressiveness of the model.

Adaptive, Personalized Diversity for Visual Discovery

This work explores extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon, and presents the user with a diverse set of interesting items while adapting to user interactions.

Practical Lessons from Predicting Clicks on Ads at Facebook

This paper introduces a model which combines decision trees with logistic regression, outperforming either of these methods on its own by over 3%, an improvement with significant impact to the overall system performance.

Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding

A Convolutional Sequence Embedding Recommendation Model (Caser) is proposed, which embeds a sequence of recent items into an image in the time and latent spaces and learns sequential patterns as local features of the image using convolutional filters.

Session-based Recommendations with Recurrent Neural Networks

It is argued that, by modeling the whole session, an RNN-based approach can provide more accurate session-based recommendations, and several modifications to classic RNNs, such as a ranking loss function, are introduced to make it more viable for this specific problem.

Improving Sequential Recommendation with Knowledge-Enhanced Memory Networks

This paper proposes a novel knowledge-enhanced sequential recommender that integrates RNN-based networks with a Key-Value Memory Network (KV-MN) and incorporates knowledge base information to enhance the semantic representation of the KV-MN.

Recurrent Recommender Networks

Recurrent Recommender Networks (RRN) are proposed that are able to predict future behavioral trajectories by endowing both users and movies with a Long Short-Term Memory (LSTM) autoregressive model that captures dynamics, in addition to a more traditional low-rank factorization.

Collaborative Memory Network for Recommendation Systems

Collaborative Memory Network is proposed, a deep architecture that unifies the two classes of CF models, capitalizing on the strengths of the global structure of latent factor models and the local neighborhood-based structure in a nonlinear fashion.

Sequential User-based Recurrent Neural Network Recommendations

This paper extends Recurrent Neural Networks by considering unique characteristics of the Recommender Systems domain and shows how individual users can be represented in addition to sequences of consumed items in a new type of Gated Recurrent Unit to effectively produce personalized next item recommendations.

Heuristic Search for Generalized Stochastic Shortest Path MDPs

A new heuristic-search-based family of algorithms, FRET (Find, Revise, Eliminate Traps), is presented and a preliminary empirical evaluation shows that FRET solves GSSPs much more efficiently than Value Iteration.