Term-Weighting Approaches in Automatic Text Retrieval
TLDR
The experimental evidence accumulated over the past 20 years indicates that text-indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations.
  • 8,442
  • 609
OHSUMED: an interactive retrieval evaluation and new large test collection for research
TLDR
A series of information retrieval experiments was carried out with a computer installed in a medical practice setting for relatively inexperienced physician end-users.
  • 880
  • 83
Retrieval evaluation with incomplete information
TLDR
This paper examines whether Cranfield evaluation methodology is robust to gross violations of the completeness assumption (i.e., the assumption that all relevant documents within a test collection have been identified and are present in the collection).
  • 690
  • 69
Pivoted document length normalization
TLDR
We present pivoted normalization, a technique that can be used to modify any normalization function, thereby reducing the gap between the relevance and the retrieval probabilities.
  • 470
  • 39
Implementation of the SMART Information Retrieval System
  • 362
  • 33
Improving automatic query expansion
TLDR
We investigate ways to improve the query expansion process by refining the set of documents used in feedback, applying Boolean filters along with proximity constraints.
  • 660
  • 32
Automatic Query Expansion Using SMART: TREC 3
TLDR
The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text.
  • 608
  • 30
The effect of topic set size on retrieval experiment error
TLDR
This paper uses TREC results to empirically derive error rates based on the number of topics used in a test and the observed difference in the average scores.
  • 246
  • 26
Improving retrieval performance by relevance feedback
TLDR
Relevance feedback is an automatic process, introduced over 20 years ago, designed to produce improved query formulations following an initial retrieval operation.
  • 642
  • 25
Automatic Text Structuring and Summarization
TLDR
This study applies the ideas from the automatic link generation research to attack another important problem in text processing—automatic text summarization by passage extraction.
  • 525
  • 25