# Noise tolerance of learning to rank under class-conditional label noise

```bibtex
@article{Haddad2022NoiseTO,
  title   = {Noise tolerance of learning to rank under class-conditional label noise},
  author  = {Dany Haddad},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2208.02126}
}
```

Often, the data used to train ranking models is subject to label noise. For example, in web-search, labels created from clickstream data are noisy due to issues such as insufficient information in item descriptions on the SERP, query reformulation by the user, and erratic or unexpected user behavior. In practice, it is difficult to handle label noise without making strong assumptions about the label generation process. As a result, practitioners typically train their learning-to-rank (LtR…
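To make the class-conditional noise (CCN) setting concrete, here is a minimal sketch of how such noise corrupts binary relevance labels: each label flips with a probability that depends only on its true class, not on the query or document features. The function name `flip_labels_ccn` and the rate parameters `rho_pos` and `rho_neg` are illustrative choices, not part of the paper.

```python
import random


def flip_labels_ccn(labels, rho_pos, rho_neg, rng=None):
    """Corrupt binary relevance labels with class-conditional noise.

    A relevant label (1) flips to 0 with probability rho_pos; an
    irrelevant label (0) flips to 1 with probability rho_neg. The flip
    probability depends only on the true class, which is the defining
    assumption of the CCN model.
    """
    rng = rng or random.Random(0)
    noisy = []
    for y in labels:
        flip_prob = rho_pos if y == 1 else rho_neg
        noisy.append(1 - y if rng.random() < flip_prob else y)
    return noisy


# With zero noise rates the labels are unchanged; with rates of 1.0
# every label is flipped.
clean = [1, 0, 1, 1, 0]
assert flip_labels_ccn(clean, 0.0, 0.0) == clean
assert flip_labels_ccn(clean, 1.0, 1.0) == [0, 1, 0, 0, 1]
```

In a clickstream setting, `rho_pos` might model relevant results that go unclicked (e.g., a poor snippet on the SERP) and `rho_neg` irrelevant results clicked by mistake; the CCN assumption is that these rates are constant per class.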
