Exploiting Wikipedia as external knowledge for document clustering


In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core…
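The problem described above can be illustrated with a minimal sketch. It compares two short documents about the same topic that share no words, first as plain bags of words and then after mapping words to shared concepts. The example documents, the `TOY_CONCEPTS` table, and all function names are assumptions made up for illustration; the hand-written table is only a stand-in for some external knowledge source and is not the method of the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Two hypothetical documents on the same topic with no words in common.
doc_a = "physician prescribes medication"
doc_b = "doctor recommends drugs"

# Hypothetical hand-made word-to-concept table (an assumption for this
# sketch; it only stands in for an external knowledge source).
TOY_CONCEPTS = {
    "physician": "DOCTOR",
    "doctor": "DOCTOR",
    "medication": "DRUG",
    "drugs": "DRUG",
}


def concept_bag(text):
    """Count concepts, falling back to the raw word when unmapped."""
    return Counter(TOY_CONCEPTS.get(w, w) for w in text.lower().split())


sim_words = cosine(bag_of_words(doc_a), bag_of_words(doc_b))
sim_concepts = cosine(concept_bag(doc_a), concept_bag(doc_b))

print(f"bag-of-words similarity: {sim_words:.3f}")
print(f"concept-level similarity: {sim_concepts:.3f}")
```

With no shared tokens, the bag-of-words similarity is zero, so a similarity-based clustering method has no evidence that the two documents belong together. Once both documents are expressed over shared concepts, the similarity becomes positive.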
DOI: 10.1145/1557019.1557066


@inproceedings{Hu2009ExploitingWA,
  title={Exploiting Wikipedia as external knowledge for document clustering},
  author={Xiaohua Hu and Xiaodan Zhang and Caimei Lu and E. K. Park and Xiaohua Zhou},
  booktitle={KDD},
  year={2009}
}