LETOR: A benchmark collection for research on learning to rank for information retrieval

Abstract

LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR were selected, how the documents were sampled, how the learning features and meta information were extracted, and how the datasets were partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that can be supported by LETOR, in addition to algorithm comparison. We hope that this paper can help people gain a deeper understanding of LETOR and enable more interesting research projects on learning to rank and related topics.
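As a rough illustration of how such a collection is consumed in practice, the sketch below parses one line of the SVMlight-style format that LETOR datasets are commonly distributed in (`<label> qid:<qid> <fid>:<value> ... #docid = ...`). The exact feature ids and the docid comment shown here are illustrative assumptions, not taken from the paper.

```python
def parse_letor_line(line):
    """Parse one LETOR-style line into (relevance, qid, features, docid).

    Assumed format (an assumption about the distributed files, not from
    the abstract): "2 qid:10 1:0.03 2:0.71 ... #docid = GX057-..."
    """
    body, _, comment = line.partition("#")
    tokens = body.split()
    relevance = int(tokens[0])           # graded relevance judgment
    qid = tokens[1].split(":", 1)[1]     # "qid:10" -> "10"
    features = {}
    for tok in tokens[2:]:               # remaining "fid:value" pairs
        fid, val = tok.split(":", 1)
        features[int(fid)] = float(val)
    # docid comes from the trailing comment, if present
    docid = comment.split("=", 1)[1].strip() if "=" in comment else None
    return relevance, qid, features, docid

# Hypothetical example line:
rel, qid, feats, docid = parse_letor_line(
    "2 qid:10 1:0.03 2:0.71 3:0.50 #docid = GX057-example"
)
print(rel, qid, feats, docid)
```

Grouping parsed lines by `qid` then yields the per-query document lists that learning to rank algorithms train on.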

DOI: 10.1007/s10791-009-9123-y


Cite this paper

@article{Qin2009LETORAB,
  title   = {LETOR: A benchmark collection for research on learning to rank for information retrieval},
  author  = {Tao Qin and Tie-Yan Liu and Jun Xu and Hang Li},
  journal = {Information Retrieval},
  year    = {2009},
  volume  = {13},
  pages   = {346--374}
}