Improving Information Retrieval Using Document Clusters and Semantic Synonym Extraction

Abstract

Document clustering has been investigated in several areas of text mining and information retrieval. Initially, it was studied as a way to improve precision or recall in information retrieval systems and as an efficient means of finding the nearest neighbors of a document. More recently, clustering has been proposed for browsing a document collection and for organizing the results a search engine returns in response to a user’s query. This paper presents a new semantic synonym-based correlation indexing method. Documents are first clustered according to their nearest neighbors in the collection; the clusters are then refined by semantically relating the query terms to the retrieved documents using a thesaurus or ontology model. The aim is to improve the performance of an Information Retrieval System (IRS) by increasing the number of relevant documents retrieved. Results show that the proposed method achieves a significant improvement over existing methods and may place the more relevant documents at the top ranks.
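The two stages described in the abstract (synonym-based query expansion and nearest-neighbor grouping of documents) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy thesaurus, the sample documents, the raw term-count vectors, the cosine similarity measure, and the function names (`expand_query`, `nearest_neighbors`, `retrieve`) are all assumptions chosen for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus; the thesaurus or ontology used in the paper
# is not specified in the abstract.
THESAURUS = {"car": {"automobile", "vehicle"}, "fast": {"quick", "rapid"}}

# Hypothetical sample collection.
DOCS = [
    "the automobile engine is rapid",
    "an automobile engine needs oil",
    "cooking pasta with tomato sauce",
    "engine oil change guide",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def expand_query(terms, thesaurus):
    """Return the query terms together with their thesaurus synonyms."""
    expanded = set(terms)
    for term in terms:
        expanded |= thesaurus.get(term, set())
    return expanded


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbors(vectors, idx, k):
    """Indices of the k documents most similar to document idx."""
    scored = [(cosine(vectors[idx], v), j)
              for j, v in enumerate(vectors) if j != idx]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Ignore documents that share no terms at all.
    return [j for score, j in scored[:k] if score > 0]


def retrieve(query, docs, thesaurus, k=1):
    """Match the expanded query, then add each hit's nearest neighbors."""
    vectors = [Counter(tokenize(d)) for d in docs]
    terms = expand_query(tokenize(query), thesaurus)
    # Score each document by how often it contains an expanded query term.
    scores = {i: sum(v[t] for t in terms) for i, v in enumerate(vectors)}
    seeds = [i for i, s in scores.items() if s > 0]
    candidates = set(seeds)
    for i in seeds:
        candidates.update(nearest_neighbors(vectors, i, k))
    # Rank by direct match score; neighbor-only documents come last.
    return sorted(candidates, key=lambda i: (-scores[i], i))


if __name__ == "__main__":
    print(retrieve("fast car", DOCS, THESAURUS))
    print(retrieve("fast car", DOCS, {}))
```

With this toy data, the unexpanded query "fast car" matches no document literally, while the expanded query reaches the documents that mention "automobile" or "rapid" and then pulls in a further document through nearest-neighbor similarity. A real system would likely use TF-IDF weighting and a proper lexical resource in place of the hand-written dictionary.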

Cite this paper

@inproceedings{Bharathi2012ImprovingIR, title={Improving Information Retrieval Using Document Clusters and Semantic Synonym Extraction}, author={Ganapathi Bharathi and D. Venkatesan}, year={2012} }