Anna Potapenko

Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. In this tutorial we introduce a novel non-Bayesian approach, called Additive Regularization of Topic Models. ARTM is free of redundant probabilistic assumptions and provides a simple inference for many combined and multi-objective topic models.
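As a brief sketch of the general ARTM setup described above (a summary, not text from the abstract): with term probabilities \varphi_{wt}=p(w\mid t), topic probabilities \theta_{td}=p(t\mid d), term counts n_{dw}, and regularizers R_i weighted by coefficients \tau_i, ARTM maximizes the regularized log-likelihood

\max_{\Phi,\Theta}\ \sum_{d\in D}\sum_{w\in d} n_{dw}\,\ln\!\sum_{t\in T}\varphi_{wt}\,\theta_{td}\;+\;\sum_{i}\tau_i R_i(\Phi,\Theta),

and the EM-like M-step takes the form

\varphi_{wt}\propto\Bigl(n_{wt}+\varphi_{wt}\,\tfrac{\partial R}{\partial\varphi_{wt}}\Bigr)_{+},\qquad \theta_{td}\propto\Bigl(n_{td}+\theta_{td}\,\tfrac{\partial R}{\partial\theta_{td}}\Bigr)_{+},

where (\cdot)_{+} denotes the positive part and n_{wt}, n_{td} are the expected topic-term and document-topic counts from the E-step.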
Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. Determining the optimal number of topics remains a challenging problem in topic modeling. We propose a simple entropy regularization for topic selection in terms of Additive Regularization of Topic Models (ARTM), a multicriteria approach for combining …
  • K. V. Vorontsov (Dorodnicyn Computing Centre of the Russian Academy of Sciences; Moscow Institute of Physics and Technology, Russia), A. A. Potapenko, +4 others
  • 2014
Probabilistic topic modeling is a rapidly developing branch of statistical text analysis. A topic model uncovers the hidden thematic structure of a text collection. Learning a topic model from a document collection has an infinite set of solutions. This non-uniqueness results in weak interpretability and instability of the solution. To tackle these …
In this paper we introduce a generalized learning algorithm for probabilistic topic models (PTM). Many known and new algorithms for the PLSA, LDA, and SWB models can be obtained as its special cases by choosing a subset of the following "options": regularization, sampling, update frequency, sparsing, and robustness. We show that a robust topic model, which …
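To make the role of such "options" concrete, below is a minimal, hedged sketch (not the paper's reference implementation) of one regularized EM-style iteration for a PLSA-like model in Python/NumPy; the single coefficient tau, standing in for a generic sparsing regularizer, is an illustrative assumption only.

    # Minimal sketch of one regularized EM-style iteration for a PLSA-like
    # topic model; tau is a stand-in for a generic sparsing regularizer and is
    # an illustrative assumption, not the paper's exact formulation.
    import numpy as np

    def em_iteration(n_dw, phi, theta, tau=0.0):
        """n_dw: (D, W) term counts; phi: (W, T) p(w|t); theta: (T, D) p(t|d)."""
        # E-step: p(t | d, w) is proportional to phi_wt * theta_td
        p_tdw = phi[np.newaxis, :, :] * theta.T[:, np.newaxis, :]   # (D, W, T)
        p_tdw /= p_tdw.sum(axis=2, keepdims=True) + 1e-12
        # Expected topic-term and document-topic counts
        n_wt = np.einsum('dw,dwt->wt', n_dw, p_tdw)
        n_td = np.einsum('dw,dwt->td', n_dw, p_tdw)
        # M-step: subtract tau (sparsing) and take the positive part,
        # then renormalize columns so phi and theta stay stochastic
        phi_new = np.maximum(n_wt - tau, 0.0)
        phi_new /= phi_new.sum(axis=0, keepdims=True) + 1e-12
        theta_new = np.maximum(n_td - tau, 0.0)
        theta_new /= theta_new.sum(axis=0, keepdims=True) + 1e-12
        return phi_new, theta_new

With tau = 0 this reduces to a plain PLSA EM step; a positive tau sparsifies the distributions, which illustrates one simple instance of the options listed above.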