Learning to Rank for Information Retrieval

@inproceedings{Liu2011LearningTR,
  title={Learning to Rank for Information Retrieval},
  author={Tie-Yan Liu},
  year={2011}
}
The usual approach to the optimisation of ranking algorithms, for search and in many other contexts, is to obtain some training set of labelled data, optimise the algorithm on this training set, and then apply the resulting model (with the chosen optimal parameter set) to the live environment. (There may be an intermediate test stage, but this does not affect the present argument.) This approach involves the choice of a metric, in this context normally some particular IR effectiveness metric. It is…
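As a concrete illustration of this loop (a sketch of my own, not taken from the book), the snippet below evaluates a hypothetical ranker by mean NDCG@k on labelled training queries and picks the best of several candidate parameter settings; rank_fn, candidate_params and the query format are placeholder assumptions.

import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain over the top-k graded relevance labels.
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # NDCG@k: DCG of the given ordering divided by the DCG of the ideal ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def select_best_parameters(candidate_params, rank_fn, training_queries, k=10):
    # Pick the parameter setting whose rankings maximise mean NDCG@k over the
    # labelled training queries; rank_fn(params, docs) is assumed to return a
    # permutation of document indices, best first.
    def mean_ndcg(params):
        scores = []
        for docs, labels in training_queries:
            order = rank_fn(params, docs)
            scores.append(ndcg_at_k([labels[i] for i in order], k))
        return sum(scores) / len(scores)
    return max(candidate_params, key=mean_ndcg)

The selected parameter setting would then be applied unchanged in the live environment, matching the train-then-deploy pattern described above.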
Probabilistic Models over Ordered Partitions with Applications in Document Ranking and Collaborative Filtering
A probabilistic generative model is proposed that models the process of ranking documents in a super-exponential combinatorial state space with an unknown number of partitions and an unknown ordering among them; it is shown that, with suitable parameterisation, the models can be learned in linear time.
Scalability and Performance of Random Forest based Learning-to-Rank for Information Retrieval
This research investigates random forest based LtR algorithms, develops methods for estimating the bias and variance of rank-learning algorithms, and examines their empirical behavior with respect to the parameters of the learning algorithm.
Modelling human preferences for ranking and collaborative filtering: a probabilistic ordered partition approach
This paper proposes a novel approach that constructs probabilistic models directly on the collection of objects, exploiting the combinatorial structure induced by the ties among them, and demonstrates that the models are competitive with the state of the art.
The whens and hows of learning to rank for web search
The comprehensive experiments provide the first empirical derivation of best practices for learning-to-rank deployments, finding that the smallest effective sample for a given query set depends on the type of information need of the queries, the document representation used during sampling, and the test evaluation measure.
Learning to Rank under Multiple Annotators
The results reveal that, when each training instance is labeled by multiple annotators who may be unreliable, the maximum likelihood approach significantly outperforms the first approach and achieves results comparable to those of a model trained on reliable labels.
Learning to Rank Using Markov Random Fields
A novel approach to learning to rank is presented that can natively integrate any target metric with no modifications; it employs the pseudo-likelihood as an accurate surrogate of the likelihood, avoiding explicit computation of the normalization factor of the Boltzmann distribution.
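For context, the pseudo-likelihood referred to here is the standard surrogate that replaces the joint Boltzmann distribution with a product of per-variable conditionals, so the partition function never needs to be computed (this is the general definition, not the paper's specific parameterisation):

P(\mathbf{x}) = \frac{1}{Z} \exp\big(-E(\mathbf{x})\big)
\quad\leadsto\quad
\mathrm{PL}(\mathbf{x}) = \prod_{i=1}^{n} P\big(x_i \mid \mathbf{x}_{\setminus i}\big)

where each conditional depends only on the neighbours of x_i in the Markov random field and is normalised over the single variable x_i, which is cheap.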
Learning to rank for information retrieval
Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches; the relationship between the loss functions used in these approaches and widely used IR evaluation measures is analyzed; and the performance of these approaches on the LETOR benchmark datasets is evaluated.
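To make the three families concrete, here is a minimal sketch (mine, not from the survey) of one representative loss per family: squared error as a pointwise loss, a RankNet-style logistic loss on score differences as a pairwise loss, and a ListNet-style top-one cross entropy as a listwise loss.

import math

def pointwise_loss(score, label):
    # Pointwise: treat each document on its own, e.g. squared error between
    # the predicted score and the graded relevance label.
    return (score - label) ** 2

def pairwise_loss(score_i, score_j):
    # Pairwise: penalise mis-ordered pairs; logistic loss on the score
    # difference, assuming document i is more relevant than document j.
    return math.log(1.0 + math.exp(-(score_i - score_j)))

def listwise_loss(scores, labels):
    # Listwise: compare whole rankings at once; cross entropy between the
    # top-one probabilities induced by the labels and by the predicted scores.
    def softmax(xs):
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        z = sum(exps)
        return [e / z for e in exps]
    p, q = softmax(labels), softmax(scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))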
Competence-Conscious Associative Rank Aggregation
This paper investigates learning-to-rank methods that uncover, from the training data, associations between document features and relevance levels in order to estimate the relevance of documents with regard to a given query, and proposes a new aggregation paradigm, competence-conscious associative rank aggregation.
Undersampling Techniques to Re-balance Training Data for Large Scale Learning-to-Rank
This study investigates the imbalanced nature of LtR training sets and suggests that, for large-scale LtR tasks, undersampling techniques can be leveraged to reduce training time with a negligible effect on performance.
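A minimal sketch of the kind of per-query undersampling being described (the paper's actual sampling scheme may differ): keep every relevant document but only a random fraction of the non-relevant ones, so the training set shrinks while all positive examples survive.

import random

def undersample_query(docs, labels, keep_ratio=0.1, seed=0):
    # labels are graded relevance judgements; anything > 0 counts as relevant
    # and is always kept, while non-relevant documents are kept with
    # probability keep_ratio.
    rng = random.Random(seed)
    kept = [(d, l) for d, l in zip(docs, labels)
            if l > 0 or rng.random() < keep_ratio]
    return [d for d, _ in kept], [l for _, l in kept]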
Plackett-Luce Regression Mixture Model for Heterogeneous Rankings
A probabilistic graphical model called the Plackett-Luce Regression Mixture (PLRM) is developed and its inference via the Expectation-Maximization algorithm is described, showcasing the effectiveness of PLRM as opposed to a pipelined approach of clustering followed by learning to rank.
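The Plackett-Luce model at the core of PLRM assigns a ranking the probability of drawing items one at a time without replacement, each with probability proportional to its exponentiated score. A sketch of just that base likelihood (the regression and mixture layers of PLRM are not reproduced here):

import math

def plackett_luce_prob(scores, ranking):
    # Probability of `ranking` (a list of item indices, best first) under the
    # Plackett-Luce model with item scores `scores`: at each step the next item
    # is chosen with probability exp(score) divided by the sum of exp(score)
    # over the items not yet placed.
    prob = 1.0
    remaining = list(ranking)
    for item in ranking:
        denom = sum(math.exp(scores[j]) for j in remaining)
        prob *= math.exp(scores[item]) / denom
        remaining.remove(item)
    return prob

For example, plackett_luce_prob([2.0, 0.5, 1.0], [0, 2, 1]) gives the probability that item 0 is ranked first, item 2 second, and item 1 last.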

References

(Showing 1-10 of 152 references.)
AdaRank: a boosting algorithm for information retrieval
The proposed novel learning algorithm, referred to as AdaRank, repeatedly constructs 'weak rankers' on the basis of re-weighted training data and finally combines the weak rankers linearly for making ranking predictions; it is proved that the training process of AdaRank is exactly that of enhancing the performance measure used.
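A rough sketch of the boosting loop this abstract describes, with the weak-ranker selection and query re-weighting spelled out; the step-size and weight-update formulas follow the usual AdaBoost pattern and are a simplification, not a transcription of the paper's exact equations. Here `measure` is assumed to be a query-level IR metric in [0, 1] (e.g. NDCG) and weak rankers are document-scoring functions.

import math

def evaluate(ranker, docs, labels, measure):
    # Rank the documents by the ranker's scores and apply the query-level measure.
    order = sorted(range(len(docs)), key=lambda i: -ranker(docs[i]))
    return measure([labels[i] for i in order])

def adarank_style_boost(training, weak_rankers, measure, rounds=50):
    # training: list of (docs, labels) per query; weak_rankers: candidate scorers.
    n = len(training)
    weights = [1.0 / n] * n                      # one weight per training query
    ensemble = []                                # list of (alpha, weak_ranker)

    def combined(doc):                           # current linear combination
        return sum(a * h(doc) for a, h in ensemble)

    for _ in range(rounds):
        # per-query performance of every weak ranker under the chosen measure
        perf = {h: [evaluate(h, d, l, measure) for d, l in training]
                for h in weak_rankers}
        # greedily pick the weak ranker that is best on the re-weighted queries
        best = max(weak_rankers,
                   key=lambda h: sum(w * e for w, e in zip(weights, perf[h])))
        good = sum(w * (1 + e) for w, e in zip(weights, perf[best]))
        bad = sum(w * (1 - e) for w, e in zip(weights, perf[best]))
        alpha = 0.5 * math.log((good + 1e-12) / (bad + 1e-12))
        ensemble.append((alpha, best))
        # queries on which the combined ranker still does poorly gain weight
        ens_perf = [evaluate(combined, d, l, measure) for d, l in training]
        weights = [math.exp(-e) for e in ens_perf]
        z = sum(weights)
        weights = [w / z for w in weights]
    return combined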
LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval
This paper constructs a benchmark dataset referred to as LETOR, derived from existing data sets widely used in IR, namely the OHSUMED and TREC data, and provides the results of several state-of-the-art learning-to-rank algorithms on the data.
An Efficient Boosting Algorithm for Combining Preferences
This work describes and analyzes an efficient algorithm called RankBoost for combining preferences, based on the boosting approach to machine learning, and gives theoretical results describing the algorithm's behavior both on the training data and on new test data not seen during training.
Direct Maximization of Rank-Based Metrics for Information Retrieval
Ranking is an essential component of a number of tasks, such as information retrieval and collaborative filtering. It is often the case that the underlying task attempts to maximize some evaluation…
Learning to rank for information retrieval
Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches; the relationship between the loss functions used in these approaches and widely used IR evaluation measures is analyzed; and the performance of these approaches on the LETOR benchmark datasets is evaluated.
Learning to rank with SoftRank and Gaussian processes
The SoftRank framework is extended to make use of the score uncertainties that are naturally provided by a Gaussian process (GP), a probabilistic non-linear regression model, which gives improved performance and efficiency.
Learning to Order Things
An on-line algorithm for learning preference functions based on Freund and Schapire's "Hedge" algorithm is considered, and it is shown that the problem of finding the ordering that agrees best with a learned preference function is NP-complete.
Ranking refinement and its application to information retrieval
This work presents a novel boosting framework for ranking refinement that can effectively leverage the two sources of information, and it significantly outperforms baseline algorithms that incorporate the outputs from the base ranker as an additional feature.
Adapting ranking SVM to document retrieval
Two methods, gradient descent and quadratic programming, are employed to optimize the loss function, and experimental results on two datasets show that the modified Ranking SVM can outperform the conventional Ranking SVM and other existing methods for document retrieval.
An Unsupervised Learning Algorithm for Rank Aggregation
This work presents a novel unsupervised learning algorithm for rank aggregation (ULARA), which returns a linear combination of the individual ranking functions based on the principle of rewarding ordering agreement between the rankers.
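As a point of reference for this aggregation setting, a linear combination of rankers can be realised as a weighted Borda count; the sketch below is that generic scheme, not the ULARA weight-update rule.

def aggregate_rankings(rankings, weights=None):
    # rankings: list of rankings, each a list of item ids ordered best first.
    # An item at position p in a ranking of length n contributes
    # weight * (n - p) to its aggregate score; items are returned best first.
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, w in zip(rankings, weights):
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + w * (n - pos)
    return sorted(scores, key=scores.get, reverse=True)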