Probabilistic Topic Models

@article{Blei2010ProbabilisticTM,
  title={Probabilistic Topic Models},
  author={David M. Blei and Lawrence Carin and David B. Dunson},
  journal={IEEE Signal Processing Magazine},
  year={2010},
  volume={27},
  pages={55--65}
}
In this article, we review probabilistic topic models: graphical models that can be used to summarize a large collection of documents with a smaller number of distributions over words. [...] We discuss two extensions of topic models to time-series data: one that lets the topics slowly change over time and one that lets the assumed prevalence of the topics change. Finally, we illustrate the application of topic models to nontext data, summarizing some recent research results in image analysis.
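The abstract's core idea, representing each document as a mixture over a small number of word distributions (topics), can be sketched with an off-the-shelf LDA implementation. This is a minimal illustration assuming scikit-learn is available; the toy corpus and parameter choices below are invented for the example and are not from the paper:

```python
# Fit LDA on a toy corpus and inspect the learned topic-word matrix.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "topic models summarize document collections",
    "images can be modeled with visual words",
    "topics drift slowly over time in a corpus",
    "signal processing methods analyze time series",
]
counts = CountVectorizer().fit_transform(docs)  # document-term count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# components_ has one row per topic; each row is an unnormalized
# distribution over the vocabulary.
print(lda.components_.shape)
```

Normalizing a row of `components_` to sum to one gives that topic's distribution over words, which is the "smaller number of distributions over words" the abstract refers to.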
Probabilistic Topic Models
In this chapter, the reader is introduced to an unsupervised, probabilistic analysis model known as topic models, where the topic distribution over terms and the document distribution over topics are broken down into two major components.
A Survey of Topic Model Inference Techniques
This paper investigates topic modeling literature based on LDA and presents discoveries and the state of the art in the topic, as well as challenges and popular tools.
Bayesian Nonparametric Relational Topic Model through Dependent Gamma Processes
A nonparametric relational topic model using stochastic processes instead of fixed-dimensional probability distributions is proposed in this paper, which can discover the hidden topics and their number simultaneously.
Multi-objective Topic Modeling
Comparisons with LDA show that adoption of MOEA approaches enables significantly more coherent topics than LDA, consequently enhancing the use and interpretability of these models in a range of applications, without significant degradation in generalization ability.
Fast and modular regularized topic modelling
A non-Bayesian multiobjective approach called the Additive Regularization of Topic Models (ARTM) is developed, based on regularized Maximum Likelihood Estimation (MLE), and it is shown that many of the well-known Bayesian topic models can be re-formulated in a much simpler way using the regularization point of view.
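For context, the criterion that additive regularization optimizes can be written as follows. This is a sketch based on the ARTM literature; the notation (word counts n_dw, topic-word matrix Φ, document-topic matrix Θ, regularizers R_i with weights τ_i) is assumed, not taken from this snippet:

```latex
% Regularized log-likelihood maximized jointly over \Phi and \Theta:
% n_{dw} = count of word w in document d; \phi_{wt}, \theta_{td} = model parameters
\sum_{d,w} n_{dw} \ln \sum_{t} \phi_{wt}\,\theta_{td}
  \;+\; \sum_{i} \tau_i\, R_i(\Phi,\Theta)
  \;\longrightarrow\; \max_{\Phi,\Theta}
```

Setting all τ_i = 0 recovers plain MLE for PLSA, which is why many Bayesian topic models reappear here as particular choices of regularizer.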
Diagnosing and Improving Topic Models by Analyzing Posterior Variability
This work proposes a metric called topic stability that measures the variability of the topic parameters under the posterior and shows that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models.
Group topic model: organizing topics into groups
The proposed group latent Dirichlet allocation (GLDA) model is evaluated for topic modeling and document clustering, and the experimental results indicate that GLDA can achieve competitive performance when compared with state-of-the-art approaches.
Full-Text or Abstract? Examining Topic Coherence Scores Using Latent Dirichlet Allocation
  • Shaheen Syed, M. Spruit
  • Computer Science
  • 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
  • 2017
It is shown that document frequency, document word length, and vocabulary size have mixed practical effects on topic coherence and human topic ranking of LDA topics, and that large document collections are less affected by incorrect or noise terms being part of the topic-word distributions, causing topics to be more coherent and ranked higher.
Topic Classification Through Topic Modeling with Additive Regularization for Collection of Scientific Papers
The author suggested additive regularization when creating models to single out topic clusters from Probabilistic Latent Semantic Analysis (PLSA), which allows singling out topic classes based on their density in the document-topic space (matrix Θ) for a selected collection of documents.
Topic modeling for conference analytics
This work presents an attempt to understand the research topics that characterize the papers submitted to a conference, by using topic modeling and data visualization techniques, and compares the automatically inferred topics against the expert-defined topics.

References

Showing 1-10 of 72 references
Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents
This work considers the problem of inferring and modeling topics in a sequence of documents with known publication dates, as well as the US Presidential State of the Union addresses from 1790 to 2008, and proposes a hierarchical model that infers the change in the topic mixture weights as a function of time.
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
A Dirichlet-multinomial regression topic model is proposed that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates.
Dynamic topic models
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.
Continuous Time Dynamic Topic Models
An efficient variational approximate inference algorithm is derived that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points.
Finding scientific topics
  • T. Griffiths, M. Steyvers
  • Computer Science, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 2004
A generative model for documents, introduced by Blei, Ng, and Jordan, is described, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.
Topics over time: a non-Markov continuous-time model of topical trends
An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.
Hierarchical relational models for document networks
The relational topic model (RTM), a hierarchical model of both network structure and node attributes, is developed, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary.
Latent Dirichlet Allocation
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and …
Reading Tea Leaves: How Humans Interpret Topic Models
New quantitative methods for measuring semantic meaning in inferred topics are presented, showing that they capture aspects of the model that are undetected by previous measures of model quality based on held-out likelihood.
Topic Models
Here, K is the number of components in the mixture model. For each k, f(x; θk) is the pdf of component number k. The scalar αk is the proportion of component number k. The specific topic model we …
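The truncated snippet above describes a standard finite mixture density; written out from the quantities it defines:

```latex
% K-component mixture: \alpha_k are mixing proportions, f(x;\theta_k) component pdfs
f(x) \;=\; \sum_{k=1}^{K} \alpha_k\, f(x;\theta_k),
\qquad \alpha_k \ge 0,\quad \sum_{k=1}^{K} \alpha_k = 1
```

In a topic model, the components f(x; θk) are the topics (distributions over words) and the proportions αk play the role of a document's topic weights.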