Exploiting Wikipedia as external knowledge for document clustering


In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core…
DOI: 10.1145/1557019.1557066
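
The failure mode described in the abstract can be illustrated with a minimal sketch. The three toy sentences and the helper names `bag_of_words` and `cosine` are assumptions made for illustration; they are not taken from the paper, and the sketch shows only the problem with plain word overlap, not the paper's Wikipedia-based method.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as lowercase word counts (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy documents (assumed for illustration): doc_a and doc_b describe the
# same topic with different core words; doc_c shares words with doc_a.
doc_a = "physician prescribes medication"
doc_b = "doctor recommends drugs"
doc_c = "physician prescribes drugs"

sim_ab = cosine(bag_of_words(doc_a), bag_of_words(doc_b))
sim_ac = cosine(bag_of_words(doc_a), bag_of_words(doc_c))

print(f"same topic, no shared words: {sim_ab:.3f}")
print(f"shared core words:           {sim_ac:.3f}")
```

Because `doc_a` and `doc_b` share no words, their similarity is zero even though they are about the same topic, so a clustering method that relies on this representation alone has no evidence for grouping them.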



Cite this paper

@inproceedings{Hu2009ExploitingWA,
  title={Exploiting Wikipedia as external knowledge for document clustering},
  author={Xiaohua Hu and Xiaodan Zhang and Caimei Lu and E. K. Park and Xiaohua Zhou},
  booktitle={KDD},
  year={2009}
}