Learning to classify short and sparse text & web with hidden topics from large-scale data collections

@inproceedings{Phan2008LearningTC,
  title={Learning to classify short and sparse text \& web with hidden topics from large-scale data collections},
  author={Xuan Hieu Phan and Minh Le Nguyen and Susumu Horiguchi},
  booktitle={WWW},
  year={2008}
}
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from large-scale data collections. The main motivation of this work is that many classification tasks working with short segments of text & Web, such as search snippets, forum & chat messages, blog & news feeds, product reviews, and book & movie summaries, fail to achieve high accuracy due to data sparseness. We, therefore, come…
This paper has highly influenced 46 other papers.


[Figure: citations per year, 2008 to 2018]


Semantic Scholar estimates that this publication has 595 citations based on the available data.
