Exploiting Wikipedia as external knowledge for document clustering

Abstract

In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core…
DOI: 10.1145/1557019.1557066
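
The failure mode described in the abstract can be illustrated with a minimal sketch. This is not the paper's method; it only shows a plain bag-of-words comparison, assuming whitespace tokenization and cosine similarity over raw term counts. The three example documents and the function names are hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents (assumption): doc_a and doc_b discuss the same
# topic with disjoint vocabularies; doc_c shares two words with doc_a.
doc_a = "physician prescribes medication"
doc_b = "doctor recommends drugs"
doc_c = "physician prescribes antibiotics"

sim_ab = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_b))
sim_ac = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_c))

print(f"same topic, no shared words: {sim_ab:.3f}")
print(f"two shared words:            {sim_ac:.3f}")
```

Under these assumptions, doc_a and doc_b have no terms in common, so a similarity-based clustering step has no signal that they belong together, even though a reader would judge them to be about the same topic.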

6 Figures and Tables

Statistics

[Chart: Citations per Year. Vertical axis: citations, 0 to 40. Horizontal axis: year, 2009 to 2018.]

266 Citations

Semantic Scholar estimates that this publication has 266 citations based on the available data.

Cite this paper

@inproceedings{Hu2009ExploitingWA,
  title={Exploiting Wikipedia as external knowledge for document clustering},
  author={Xiaohua Hu and Xiaodan Zhang and Caimei Lu and E. K. Park and Xiaohua Zhou},
  booktitle={KDD},
  year={2009}
}