# Adapting boosting for information retrieval measures

@article{wu2010adapting,
  title={Adapting boosting for information retrieval measures},
  author={Qiang Wu and Christopher J. C. Burges and Krysta Marie Svore and Jianfeng Gao},
  journal={Information Retrieval},
  year={2010},
  volume={13},
  pages={254--270}
}

Published 1 June 2010 · Computer Science · Information Retrieval
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes…
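The optimal linear combination of two rankers mentioned in the abstract amounts to a one-dimensional search over the mixing weight. Below is a minimal sketch in Python, assuming NDCG@k as the target measure and using a fine grid in place of the paper's exact crossover enumeration; the function and parameter names are illustrative, not the paper's code:

```python
import math

def ndcg(scores, labels, k=10):
    """NDCG@k: rank documents by score, discount gains by log2 of position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    dcg = sum((2 ** labels[i] - 1) / math.log2(r + 2)
              for r, i in enumerate(order[:k]))
    ideal = sorted(labels, reverse=True)
    idcg = sum((2 ** g - 1) / math.log2(r + 2)
               for r, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def best_mixture(s1, s2, labels, steps=1000):
    """Search the mixing weight alpha for alpha*s1 + (1-alpha)*s2.

    The paper solves this 1-D line search exactly by enumerating the
    score-crossover points; a fine grid is a simple approximation of
    the same idea. Returns (best NDCG, best alpha).
    """
    best = (-1.0, 0.0)
    for t in range(steps + 1):
        a = t / steps
        mixed = [a * x + (1 - a) * y for x, y in zip(s1, s2)]
        m = ndcg(mixed, labels)
        if m > best[0]:
            best = (m, a)
    return best
```

Because the grid includes the endpoints alpha = 0 and alpha = 1, the combined ranker is never worse than either input ranker on the training query.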
## Citations (514)
This research investigates the random forest based LtR algorithms and develops methods for estimating the bias and variance of rank-learning algorithms, and examines their empirical behavior against parameters of the learning algorithm.
• 2010
This work investigates the problem of learning to rank on a cluster, using Web search data composed of 140,000 queries and approximately fourteen million URLs and a boosted tree ranking algorithm called LambdaMART, and implements a method for improving the speed of training when the training data fits in main memory on a single machine.
• Machine Learning, 2013
A strategy is proposed that builds several sequences of weak hypotheses in parallel, extends the ones that are likely to yield a good model, and otherwise converges to similar performance as the original boosting algorithms.
• ECML/PKDD, 2011
This paper introduces a learning-to-rank approach to subset ranking based on multi-class classification that outperformed many standard ranking algorithms on the LETOR benchmark datasets and is less prone to overfitting.
• 2015
A new boosting-based active learning-to-rank algorithm is proposed that introduces unlabeled data into the learning process and evaluates the performance of pairwise and listwise approaches.
• 2011
This work investigates the problem of learning to rank on a cluster using Web search data composed of 140,000 queries and approximately fourteen million URLs, and implements a method for improving the speed of training when the training data fits in main memory on a single machine by distributing the vertex split computations of the decision trees.
• TOIS, 2011
A new approach called tree-based ranking function adaptation (Trada) is proposed to effectively utilize data sources for training cross-domain ranking functions and is extended to utilize the pairwise preference data from the target domain to further improve the effectiveness of adaptation.
• ECIR, 2013
Empirical evaluation using two web collections unequivocally demonstrates that the proposed two-stage framework, being able to learn its model from more relevant documents, outperforms current learning to rank approaches.
This thesis investigates state-of-the-art machine learning methods for ranking, known as learning to rank, to explore whether they can be used in enterprise search, which means less data and fewer document features than web-based search.
• KDD, 2013
A novel learning algorithm is presented, DirectRank, which directly and exactly optimizes ranking measures without resorting to any upper bounds or approximations, and a probabilistic framework for document-query pairs is constructed to maximize the likelihood of the objective permutation of top-τ documents.

## References

Showing 1–10 of 39 references.

• NIPS, 2007
We present a general boosting method extending functional gradient boosting to optimize complex loss functions that are encountered in many machine learning problems. Our approach is based on…
• SIGIR, 2009
It is shown that LambdaRank, which smoothly approximates the gradient of the target measure, can be adapted to work with four popular IR target evaluation measures using the same underlying gradient construction.
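The gradient construction this entry refers to can be sketched as pairwise "lambda" gradients, each scaled by the change in the target measure when two documents swap rank positions. Below is a minimal sketch assuming NDCG as the target measure; substituting another measure's swap delta adapts the same construction. The names are illustrative, not the paper's code:

```python
import math

def delta_ndcg(labels, ranks, i, j, idcg):
    """|Change in NDCG| if documents i and j swap rank positions."""
    gi, gj = 2 ** labels[i] - 1, 2 ** labels[j] - 1
    di = 1.0 / math.log2(ranks[i] + 2)
    dj = 1.0 / math.log2(ranks[j] + 2)
    return abs((gi - gj) * (di - dj)) / idcg

def lambda_gradients(scores, labels, sigma=1.0):
    """LambdaRank-style gradients: a pairwise logistic gradient scaled
    by the NDCG swap delta, accumulated per document."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: -scores[i])
    ranks = [0] * n
    for r, i in enumerate(order):
        ranks[i] = r
    ideal = sorted(labels, reverse=True)
    idcg = sum((2 ** g - 1) / math.log2(r + 2) for r, g in enumerate(ideal))
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                # Logistic weight: large when the pair is mis-ordered.
                rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
                lam = sigma * rho * delta_ndcg(labels, ranks, i, j, idcg)
                lambdas[i] += lam  # push the more relevant document up
                lambdas[j] -= lam  # push the less relevant document down
    return lambdas
```

The per-pair contributions cancel in sum, so the lambdas act as opposing forces on each mis-ordered pair.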
• EMNLP, 2009
This paper explores two classes of model adaptation methods for Web search ranking: Model Interpolation and error-driven learning approaches based on a boosting algorithm. The results show that model…
• NIPS, 2007
This work considers the DCG criterion (discounted cumulative gain), a standard quality measure in information retrieval, and proposes using the Expected Relevance to convert class probabilities into ranking scores.
• SIGIR, 2005
Results show that in most test sets, LDM significantly outperforms the state-of-the-art language modeling approaches and the classical probabilistic retrieval model and it is more appropriate to train LDM using a measure of AP rather than likelihood if the IR system is graded on AP.
• SIGIR, 2007
This work presents a general SVM learning algorithm that efficiently finds a globally optimal solution to a straightforward relaxation of MAP, and shows its method to produce statistically significant improvements in MAP scores.
• CIKM, 2008
Tree adaptation assumes that ranking functions are trained with regression-tree based modeling methods, such as Gradient Boosting Trees, and takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain.
• ICML, 2007
It is proposed that learning to rank should adopt the listwise approach in which lists of objects are used as 'instances' in learning, and introduces two probability models, respectively referred to as permutation probability and top k probability, to define a listwise loss function for learning.
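The listwise loss this entry describes (ListNet) can be illustrated with its simplest variant, the top-1 probability: a softmax over scores defines a distribution over which document ranks first, and the loss is the cross entropy between the label-induced and score-induced distributions. A minimal sketch, with illustrative names:

```python
import math

def top_one_probs(scores):
    """Top-1 probability: softmax over document scores (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def listnet_loss(scores, labels):
    """Cross entropy between the top-1 distributions induced by the
    ground-truth labels and by the model scores."""
    p = top_one_probs(labels)
    q = top_one_probs(scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the score-induced distribution matches the label-induced one, so a model that reproduces the label ordering scores strictly lower than one that reverses it.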
• CIKM, 1999
A new language model for information retrieval is presented, which is based on a range of data smoothing techniques, including the Good-Turing estimate, curve-fitting functions, and model combinations, and can be easily extended to incorporate probabilities of phrases such as word pairs and word triples.