• Publications
The Probabilistic Relevance Framework: BM25 and Beyond
The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970–1980s, which led to the development of one of the most successful…
  • 955
  • 175
  • Open Access
Simple BM25 extension to multiple weighted fields
This paper describes a simple way of adapting the BM25 ranking formula to deal with structured documents. In the past it has been common to compute scores for the individual fields (e.g. title and…
  • 655
  • 93
  • Open Access
Relevance weighting for query independent evidence
A query independent feature, relating perhaps to document content, linkage or usage, can be transformed into a static, per-document relevance weight for use in ranking. The challenge is to find a…
  • 168
  • 20
  • Open Access
Microsoft Cambridge at TREC 13: Web and Hard Tracks
All our submissions from the Microsoft Research Cambridge (MSRC) team this year continue to explore issues in IR from a perspective very close to that of the original Okapi team, working first at…
  • 185
  • 17
  • Open Access
Learning to Rank Answers to Non-Factoid Questions from Web Collections
This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing…
  • 153
  • 17
  • Open Access
The Perceptron Algorithm with Uneven Margins
The perceptron algorithm with margins is a simple, fast and effective learning algorithm for linear classifiers; it produces decision hyperplanes within some constant ratio of the maximal margin. In…
  • 159
  • 17
  • Open Access
Parsimonious language models for information retrieval
We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly…
  • 145
  • 15
  • Open Access
Ad-hoc object retrieval in the web of data
Semantic Search refers to a loose set of concepts, challenges and techniques having to do with harnessing the information of the growing Web of Data (WoD) for Web search. Here we propose a formal…
  • 214
  • 14
  • Open Access
Information Retrieval: Algorithms and Heuristics
  • H. Zaragoza
  • Computer Science
  • Information Retrieval
  • 1 April 2002
  • 209
  • 13
Learning to Rank Answers on Large Online QA Collections
This work describes an answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections may be…
  • 230
  • 12
  • Open Access