Less is more: Eliminating index terms from subordinate clauses

@inproceedings{CorstonOliver1999LessIM,
  title={Less is more: Eliminating index terms from subordinate clauses},
  author={Simon Corston-Oliver and William B. Dolan},
  booktitle={ACL},
  year={1999}
}
We perform a linguistic analysis of documents during indexing for information retrieval. Eliminating index terms that occur only in subordinate clauses reduces index size by approximately 30% without adversely affecting precision or recall. These results hold for two corpora: a sample of the World Wide Web and an electronic encyclopedia.
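
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each document has already been split into clauses labelled `"main"` or `"sub"` (the linguistic analysis itself is not shown), and it assumes "occur only in subordinate clauses" is judged per document. The names `build_index`, `index_size`, and the toy `DOCS` data are hypothetical.

```python
from collections import defaultdict


def build_index(docs, drop_subordinate_only=True):
    """Build an inverted index: term -> set of document ids.

    docs maps a document id to a list of (clause_type, terms) pairs,
    where clause_type is "main" or "sub" (assumed pre-annotated).
    """
    index = defaultdict(set)
    for doc_id, clauses in docs.items():
        main_terms, sub_terms = set(), set()
        for clause_type, terms in clauses:
            target = main_terms if clause_type == "main" else sub_terms
            target.update(term.lower() for term in terms)
        # A term seen in both clause types is still in main_terms, so keeping
        # main_terms alone drops exactly the subordinate-only terms.
        kept = main_terms if drop_subordinate_only else main_terms | sub_terms
        for term in kept:
            index[term].add(doc_id)
    return dict(index)


def index_size(index):
    """Count postings, i.e. (term, document) pairs."""
    return sum(len(postings) for postings in index.values())


# Toy data, invented for illustration only.
DOCS = {
    "d1": [("main", ["retrieval", "index"]), ("sub", ["index", "clause"])],
    "d2": [("main", ["encyclopedia"]), ("sub", ["retrieval"])],
}

FULL = build_index(DOCS, drop_subordinate_only=False)
PRUNED = build_index(DOCS, drop_subordinate_only=True)
print(index_size(FULL), index_size(PRUNED))
```

The toy corpus is far too small to say anything about the roughly 30% reduction reported above; it only shows which postings the pruning rule removes.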