Probabilistic Topic Models

@article{Blei2010ProbabilisticTM,
  title={Probabilistic Topic Models},
  author={David M. Blei},
  journal={IEEE Signal Processing Magazine},
  year={2010},
  volume={27},
  pages={55--65}
}
  • D. Blei
  • Published 18 October 2010
  • Computer Science
  • IEEE Signal Processing Magazine
In this article, we review probabilistic topic models: graphical models that can be used to summarize a large collection of documents with a smaller number of distributions over words. [...] We discuss two extensions of topic models to time-series data: one that lets the topics slowly change over time and one that lets the assumed prevalence of the topics change. Finally, we illustrate the application of topic models to nontext data, summarizing some recent research results in image analysis.
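As a rough illustration of what "summarizing documents with a smaller number of distributions over words" can mean, the following minimal sketch simulates a corpus from a small set of topics. It is not taken from the article: the Dirichlet priors follow the common latent Dirichlet allocation setup, and the function names, the vocabulary, and the hyperparameter values are all assumptions chosen for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_length, num_topics, vocab, seed=0):
    """Simulate documents as mixtures of topics (hypothetical helper)."""
    rng = random.Random(seed)
    # Each topic is a distribution over the whole vocabulary.
    # The concentration value 0.5 is an assumed hyperparameter.
    topics = [sample_dirichlet([0.5] * len(vocab), rng) for _ in range(num_topics)]
    proportions, corpus = [], []
    for _ in range(num_docs):
        # Each document has its own proportions over the shared topics.
        theta = sample_dirichlet([0.5] * num_topics, rng)
        words = []
        for _ in range(doc_length):
            # Pick a topic for this word, then a word from that topic.
            z = rng.choices(range(num_topics), weights=theta)[0]
            words.append(rng.choices(vocab, weights=topics[z])[0])
        proportions.append(theta)
        corpus.append(words)
    return topics, proportions, corpus


# Illustrative vocabulary and sizes; none of these come from the article.
VOCAB = ["gene", "dna", "cell", "model", "data", "inference", "image", "pixel"]
NUM_DOCS, DOC_LENGTH, NUM_TOPICS = 5, 20, 3
topics, proportions, corpus = generate_corpus(NUM_DOCS, DOC_LENGTH, NUM_TOPICS, VOCAB)

for doc_id, theta in enumerate(proportions):
    print(doc_id, [round(p, 2) for p in theta], " ".join(corpus[doc_id][:6]))
```

Fitting a topic model runs this process in reverse: given only the words, inference recovers plausible values for the topics and the per-document proportions. The sketch covers the generative side only.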
Continuous-time Infinite Dynamic Topic Models
TLDR
This dissertation presents a model, the continuous-time infinite dynamic topic model, that combines the advantages of two models: 1) the online hierarchical Dirichlet process and 2) the continuous-time dynamic topic model.
Probabilistic topic models for sequence data
TLDR
The popular Latent Dirichlet Allocation model is extended by exploiting three different conditional Markovian assumptions, with the aim of bringing the performance advantages of sequence-modeling approaches to real-world data.
Stochastic Variational Optimization of a Hierarchical Dirichlet Process Latent Beta-Liouville Topic Model
TLDR
This article addresses the problems of model selection and of sharing topics across multiple documents in standard parametric topic models, and proposes as an alternative a BNP (Bayesian nonparametric) topic model in which the HDP (hierarchical Dirichlet process) prior models document topic mixtures through their multinomials on an infinite simplex.
topicmodels: An R Package for Fitting Topic Models
This article is a (slightly) modified and shortened version of Grün and Hornik (2011), published in the Journal of Statistical Software. Topic models allow the probabilistic modeling of term
Topic Modeling Using Latent Dirichlet allocation
TLDR
The preliminaries of topic modeling techniques are introduced, and their extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models in multilingual perspectives, are reviewed.
Probabilistic Topic Models
TLDR
In this chapter, the reader is introduced to an unsupervised, probabilistic analysis model known as topic models, where the topic distribution over terms and the document distribution over topics are broken down into two major components.
A Nonparametric N-Gram Topic Model with Interpretable Latent Topics
TLDR
This work presents a new nonparametric topic model that not only maintains the word order in the topic discovery process, but also generates topical n-gram words, leading to more interpretable latent topics in the family of nonparametric topic models.
Linguistic extensions of topic models
TLDR
This thesis extends LDA in three different ways: adding knowledge of word meaning, modeling multiple languages, and incorporating local syntactic context, which offers a new method of using topic models on corpora with multiple languages.

References

Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents
TLDR
This work considers the problem of inferring and modeling topics in a sequence of documents with known publication dates as well as the US Presidential State of the Union addresses from 1790 to 2008, and proposes a hierarchical model that infers the change in the topic mixture weights as a function of time.
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
TLDR
A Dirichlet-multinomial regression topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates is proposed.
Dynamic topic models
TLDR
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.
Markov Topic Models
TLDR
Markov topic models are developed, a novel family of generative probabilistic models that can learn topics simultaneously from multiple corpora, such as papers from different conferences, and improve quantitative performance over the state of the art.
Topic modeling: beyond bag-of-words
TLDR
A hierarchical generative probabilistic model that incorporates both n-gram statistics and latent topic variables by extending a unigram topic model to include properties of a hierarchical Dirichlet bigram language model is explored.
A correlated topic model of Science
TLDR
The correlated topic model (CTM) is developed, where the topic proportions exhibit correlation via the logistic normal distribution, and its use as an exploratory tool for large document collections is demonstrated.
The Author-Topic Model for Authors and Documents
TLDR
The author-topic model is introduced, a generative model for documents that extends Latent Dirichlet Allocation to include authorship information, and applications to computing similarity between authors and entropy of author output are demonstrated.
Continuous Time Dynamic Topic Models
TLDR
An efficient variational approximate inference algorithm is derived that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points.
Finding scientific topics
  • T. Griffiths, M. Steyvers
  • Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 2004
TLDR
A generative model for documents, introduced by Blei, Ng, and Jordan, is described, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS, with Bayesian model selection establishing the number of topics.
Topics over time: a non-Markov continuous-time model of topical trends
TLDR
An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.