Short and Sparse Text Topic Modeling via Self-Aggregation

Xiaojun Quan, Chunyu Kit, Yong Ge, Sinno Jialin Pan
The overwhelming amount of short text data on social media and elsewhere poses great challenges to topic modeling because of the sparsity problem. Most existing attempts to alleviate this problem resort to heuristic strategies that aggregate short texts into pseudo-documents before applying standard topic modeling. Although such strategies do not generalize well to broader genres of short texts, their success has shed light on how to develop a generalized solution. In this paper…
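The heuristic aggregation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each short text comes with a grouping key such as an author name; the function name `aggregate_by_key` and the sample data are hypothetical. This shows only the heuristic baseline the abstract refers to, not the self-aggregation method the paper proposes.

```python
from collections import defaultdict


def aggregate_by_key(texts, keys):
    """Concatenate short texts that share a key into one pseudo-document.

    `keys` is an assumed piece of metadata (for example author or hashtag);
    texts without such metadata cannot be aggregated this way.
    """
    groups = defaultdict(list)
    for text, key in zip(texts, keys):
        groups[key].append(text)
    # Join each group in input order so the result is deterministic.
    return {key: " ".join(parts) for key, parts in groups.items()}


# Hypothetical sample data: three short posts by two authors.
tweets = ["rainy day in london", "match tonight", "new phone arrived"]
authors = ["alice", "bob", "alice"]

pseudo_docs = aggregate_by_key(tweets, authors)
```

Each value in `pseudo_docs` is longer than any single post, so a standard topic model sees richer word co-occurrence. The dependence on an available grouping key is one reason such heuristics may not carry over to other kinds of short text.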
This paper has 54 citations.




