Learning Neural Ranking Models Online from Implicit User Feedback

@inproceedings{Jia2022LearningNR,
  title={Learning Neural Ranking Models Online from Implicit User Feedback},
  author={Yiling Jia and Hongning Wang},
  booktitle={Proceedings of the ACM Web Conference 2022},
  year={2022}
}
Existing online learning to rank (OL2R) solutions are limited to linear models, which are incompetent to capture possible non-linear relations between queries and documents. In this work, to unleash the power of representation learning in OL2R, we propose to directly learn a neural ranking model from users’ implicit feedback (e.g., clicks) collected on the fly. We focus on RankNet and LambdaRank, due to their great empirical success and wide adoption in offline settings, and control the… 
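As a concrete illustration (not the paper's exact algorithm), the sketch below shows how a small neural scorer could be updated online with a RankNet-style pairwise loss on preference pairs inferred from clicks; the network architecture, the click-to-preference heuristic (clicked documents preferred over examined-but-skipped ones), and all names are illustrative assumptions. A LambdaRank-style variant would additionally weight each pair by the ranking-metric change from swapping it.

# Minimal sketch, not the paper's method: a small neural scorer updated
# online with a RankNet-style pairwise loss on click-inferred preferences.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralScorer(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):                  # x: (n_docs, n_features)
        return self.net(x).squeeze(-1)     # (n_docs,) real-valued scores

def ranknet_update(model, optimizer, doc_feats, clicked, examined):
    """One online update from a single query's click feedback.

    clicked: indices of clicked documents; examined: set of examined indices.
    """
    scores = model(doc_feats)
    loss = scores.new_zeros(())
    # Heuristic: clicked documents are preferred over examined, unclicked ones.
    for i in clicked:
        for j in examined - set(clicked):
            loss = loss - F.logsigmoid(scores[i] - scores[j])  # RankNet pair loss
    if loss.requires_grad:                 # skip if no preference pairs observed
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return float(loss)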

Citations

Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback

This work proposes an efficient, bootstrapping-based exploration strategy for online neural ranker learning that eliminates explicit confidence-set construction and its computational overhead, enabling online neural rankers to be trained efficiently in practice with theoretical guarantees.
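The summary above stays at a high level; the sketch below is one assumed way such bootstrap-based exploration could look: an ensemble of rankers trained on randomly re-weighted losses, with the serving ranker sampled from the ensemble so exploration comes from ensemble disagreement rather than an explicit confidence set. The class, its methods, and the exponential weights are hypothetical, not the cited paper's implementation.

# Rough sketch of perturbed-feedback (bootstrap) exploration; all names and
# the weighting scheme are assumptions for illustration only.
import copy
import numpy as np

class BootstrappedRankerPool:
    def __init__(self, base_model, optimizer_fn, n_heads=5, seed=0):
        self.models = [copy.deepcopy(base_model) for _ in range(n_heads)]
        self.opts = [optimizer_fn(m) for m in self.models]
        self.rng = np.random.default_rng(seed)

    def select(self):
        """Sample one ranker to serve the next query (Thompson-style)."""
        return self.models[self.rng.integers(len(self.models))]

    def update(self, pairwise_loss_fn):
        """Update every head with an independently perturbed loss weight."""
        for model, opt in zip(self.models, self.opts):
            weight = float(self.rng.exponential(1.0))   # bootstrap-style weight
            loss = weight * pairwise_loss_fn(model)     # torch scalar with grad
            opt.zero_grad()
            loss.backward()
            opt.step()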

References


PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer

A regret bound defined directly on the number of mis-ordered pairs is proven, connecting the online solution's theoretical convergence with its expected ranking performance.
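For concreteness, the pairwise regret discussed here accumulates, over rounds, a count of the following form (illustrative helper, not PairRank's code):

def misordered_pairs(scores, relevance):
    """Number of document pairs the ranker orders against their true relevance."""
    n = len(scores)
    return sum(
        1
        for i in range(n) for j in range(n)
        if relevance[i] > relevance[j] and scores[i] <= scores[j]
    )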

Balancing Speed and Quality in Online Learning to Rank for Information Retrieval

A fast OLTR model called Sim-MGD is introduced that addresses the speed aspect of the speed-quality tradeoff, and Cascading Multileave Gradient Descent is contributed for OLTR to directly address the speed-quality tradeoff.

Differentiable Unbiased Online Learning to Rank

Pairwise Differentiable Gradient Descent is an efficient and unbiased OLTR approach that provides a better user experience than previously possible, and using a neural network is shown to lead to even better performance at convergence than a linear model.
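A simplified sketch of this pairwise, differentiable style of update is given below: a ranking is sampled from the Plackett-Luce (softmax) distribution over the model's scores, clicks induce preference pairs against higher-ranked unclicked documents, and the model takes a gradient step on those pairs. The debiasing weight used in the original method is omitted, and all names are illustrative.

# Simplified sketch in the spirit of pairwise differentiable gradient descent;
# the original method's unbiasedness reweighting is omitted for brevity.
import torch
import torch.nn.functional as F

def sample_ranking(scores):
    # Gumbel perturbation of the scores yields a sample from the
    # Plackett-Luce (softmax) ranking distribution.
    noise = torch.distributions.Gumbel(0.0, 1.0).sample(scores.shape)
    return torch.argsort(scores + noise, descending=True)

def pairwise_differentiable_step(model, optimizer, doc_feats, clicks):
    scores = model(doc_feats)                           # (n_docs,)
    shown = sample_ranking(scores.detach()).tolist()    # ranking shown to the user
    loss = scores.new_zeros(())
    for pos, d in enumerate(shown):
        if clicks[d]:
            # infer that the clicked document is preferred over the
            # unclicked documents ranked above it
            for other in shown[:pos]:
                if not clicks[other]:
                    loss = loss - F.logsigmoid(scores[d] - scores[other])
    if loss.requires_grad:
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()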

Efficient Exploration of Gradient Space for Online Learning to Rank

The proposed algorithm, named Null Space Gradient Descent, reduces the exploration space to the null space of recent poorly performing gradients, preventing the algorithm from repeatedly exploring directions that have been discouraged by the most recent user interactions.
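A rough sketch of the null-space idea, under the assumption that the recent poorly performing gradients are kept as rows of a matrix, is:

# Illustrative sketch: project candidate exploration directions onto the null
# space of recent poorly performing gradients.
import numpy as np

def null_space_projector(bad_gradients, tol=1e-10):
    """Projection matrix onto the null space of the rows of `bad_gradients`."""
    G = np.atleast_2d(np.asarray(bad_gradients, dtype=float))   # (k, d)
    _, s, vt = np.linalg.svd(G, full_matrices=True)
    rank = int(np.sum(s > tol))
    null_basis = vt[rank:].T                                    # (d, d - rank)
    return null_basis @ null_basis.T                            # (d, d)

def explore_direction(bad_gradients, dim, rng=None):
    rng = rng or np.random.default_rng()
    u = null_space_projector(bad_gradients) @ rng.standard_normal(dim)
    norm = np.linalg.norm(u)
    return u / norm if norm > 0 else u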

Variance Reduction in Gradient Exploration for Online Learning to Rank

This work projects the selected updating direction onto the space spanned by the feature vectors of the documents examined under the current query after an interleaved test, proves that the projected gradient is still an unbiased estimate of the true gradient, and shows that this lower-variance gradient estimate yields significant regret reduction.
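A minimal sketch of the projection step, assuming the examined documents' feature vectors are stacked as rows of a matrix, is:

# Sketch of the variance-reduction idea: project a proposed update direction
# onto the subspace spanned by the examined documents' feature vectors.
import numpy as np

def project_onto_span(direction, examined_feats):
    """Project `direction` (d,) onto the row space of `examined_feats` (m, d)."""
    X = np.atleast_2d(np.asarray(examined_feats, dtype=float))
    P = X.T @ np.linalg.pinv(X.T)        # projection onto the row space of X
    return P @ np.asarray(direction, dtype=float)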

Online Learning to Rank in Stochastic Click Models

BatchRank is proposed, the first online learning to rank algorithm for a broad class of click models that encompasses the two most fundamental click models, the cascade and position-based models; it is observed to outperform ranked bandits and to be more robust than CascadeKL-UCB, an existing algorithm for the cascade model.
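For reference, the cascade model mentioned here can be simulated in a few lines; the attraction probabilities below are illustrative:

# Toy simulation of the cascade click model: the user scans top-down, clicks
# an item with probability equal to its attraction, and stops after a click.
import numpy as np

def cascade_clicks(ranked_items, attraction, rng=None):
    rng = rng or np.random.default_rng()
    clicks = []
    for item in ranked_items:
        if rng.random() < attraction[item]:
            clicks.append(item)
            break                      # cascade assumption: stop after a click
    return clicks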

Multileave Gradient Descent for Fast Online Learning to Rank

An online learning to rank algorithm called multileave gradient descent (MGD) is proposed that extends DBGD to learn from so-called multileaved comparison methods that can compare a set of rankings instead of merely a pair.
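A rough sketch of an MGD-style update, with the multileaved comparison abstracted into a user-supplied feedback function and illustrative step sizes, is:

# Rough sketch of a multileave gradient descent update: several random unit
# perturbations form candidate rankers, a multileaved comparison (abstracted
# as `winners_fn`) reports which candidates beat the current ranker, and the
# weights move toward the mean of the winning directions.
import numpy as np

def mgd_update(w, winners_fn, n_candidates=9, delta=1.0, eta=0.1, rng=None):
    rng = rng or np.random.default_rng()
    directions = rng.standard_normal((n_candidates, w.size))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    candidates = w + delta * directions                  # candidate rankers
    winners = winners_fn(w, candidates)                  # indices of winning candidates
    if len(winners) == 0:
        return w                                         # current ranker not beaten
    return w + eta * directions[winners].mean(axis=0)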

Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval

The results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches.

Constructing Reliable Gradient Exploration for Online Learning to Rank

Two online learning to rank algorithms are proposed that improve the reliability of exploration by constructing robust exploratory directions, including a Multi-Point Deterministic Gradient Descent method that constructs a set of deterministic standard unit basis vectors for exploration.
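A sketch of the deterministic exploration idea, assuming a linear ranker parameterized by a float weight vector, is:

# Sketch: candidate rankers built by perturbing the current weights along
# standard unit basis vectors (both signs) rather than random directions.
import numpy as np

def unit_basis_candidates(w, coords=None, delta=1.0):
    """Return candidate weight vectors w +/- delta * e_i for selected coordinates."""
    coords = range(w.size) if coords is None else coords
    candidates = []
    for i in coords:
        e = np.zeros_like(w)
        e[i] = 1.0
        candidates.append(w + delta * e)
        candidates.append(w - delta * e)
    return np.array(candidates)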

TopRank: A practical algorithm for online stochastic ranking

This work proposes a generalized click model that encompasses many existing models, including the position-based and cascade models, and motivates a novel online learning algorithm based on topological sort, which is called TopRank.
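The ordering step of such a topological-sort-based approach could look like the sketch below, where the pairwise preferences are assumed to have already been established from click statistics (the bandit elimination logic is omitted):

# Simplified sketch of the ordering step: given pairwise preferences
# "(better, worse)" supported by accumulated click statistics, items are
# arranged into blocks via a topological sort and ranked block by block.
from collections import defaultdict

def toposort_blocks(items, prefers):
    """`prefers` is a set of (better, worse) pairs over `items`; returns blocks."""
    indegree = {i: 0 for i in items}
    out = defaultdict(set)
    for better, worse in prefers:
        if worse not in out[better]:
            out[better].add(worse)
            indegree[worse] += 1
    blocks = []
    current = [i for i in items if indegree[i] == 0]
    while current:
        blocks.append(current)
        nxt = []
        for i in current:
            for j in out[i]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    nxt.append(j)
        current = nxt
    return blocks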
...