Probabilistic Topic Models
@article{Blei2010ProbabilisticTM, title={Probabilistic Topic Models}, author={David M. Blei}, journal={IEEE Signal Processing Magazine}, year={2010}, volume={27}, pages={55-65} }
In this article, we review probabilistic topic models: graphical models that can be used to summarize a large collection of documents with a smaller number of distributions over words. [...] We discuss two extensions of topic models to time-series data: one that lets the topics slowly change over time and one that lets the assumed prevalence of the topics change. Finally, we illustrate the application of topic models to nontext data, summarizing some recent research results in image analysis.
606 Citations
Continuous-time Infinite Dynamic Topic Models
- Computer Science, ArXiv
- 2013
This dissertation presents a model, the continuous-time infinite dynamic topic model, that combines the advantages of two models: 1) the online hierarchical Dirichlet process and 2) the continuous-time dynamic topic model.
Probabilistic topic models for sequence data
- Computer Science, Machine Learning
- 2013
The popular Latent Dirichlet Allocation model is extended, by exploiting three different conditional Markovian assumptions, to extend the performance advantages of sequence-modeling approaches over real-world data.
Stochastic Variational Optimization of a Hierarchical Dirichlet Process Latent Beta-Liouville Topic Model
- Computer Science, ACM Trans. Knowl. Discov. Data
- 2022
This article addresses the problems of model selection and of sharing topics across multiple documents in standard parametric topic models, and proposes as an alternative a Bayesian nonparametric (BNP) topic model in which the hierarchical Dirichlet process (HDP) prior models document topic mixtures through their multinomials on an infinite simplex.
TopicBank: Collection of coherent topics using multiple model training with their further use for topic model validation
- Computer Science, Data Knowl. Eng.
- 2021
topicmodels: An R Package for Fitting Topic Models
- Computer Science
- 2021
This article is a (slightly) modified and shortened version of Grün and Hornik (2011), published in the Journal of Statistical Software. Topic models allow the probabilistic modeling of term…
Topic Modeling Using Latent Dirichlet allocation
- Computer Science, ACM Comput. Surv.
- 2022
The preliminaries of topic modeling techniques are introduced, and their extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word embedded topic models, and topic models in multilingual perspectives, are reviewed.
Probabilistic Topic Models
- Computer Science, Practical Text Analytics
- 2018
In this chapter, the reader is introduced to an unsupervised, probabilistic analysis model known as topic models, where the topic distribution over terms and the document distribution over topics are broken down into two major components.
A Nonparametric N-Gram Topic Model with Interpretable Latent Topics
- Computer Science, AIRS
- 2013
This work presents a new nonparametric topic model that not only maintains the word order in the topic discovery process, but also generates topical n-gram words, leading to more interpretable latent topics in the family of nonparametric topic models.
Linguistic extensions of topic models
- Computer Science
- 2010
This thesis extends LDA in three different ways: adding knowledge of word meaning, modeling multiple languages, and incorporating local syntactic context, which offers a new method of using topic models on corpora with multiple languages.
References
Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents
- Computer Science, IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2010
This work considers the problem of inferring and modeling topics in a sequence of documents with known publication dates, such as the US Presidential State of the Union addresses from 1790 to 2008, and proposes a hierarchical model that infers the change in the topic mixture weights as a function of time.
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
- Computer Science, UAI
- 2008
A Dirichlet-multinomial regression topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates is proposed.
Dynamic topic models
- Computer Science, ICML
- 2006
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.
Markov Topic Models
- Computer Science, AISTATS
- 2009
Markov topic models are developed, a novel family of generative probabilistic models that can learn topics simultaneously from multiple corpora, such as papers from different conferences, and improve quantitative performance over the state of the art.
Topic modeling: beyond bag-of-words
- Computer Science, ICML
- 2006
A hierarchical generative probabilistic model that incorporates both n-gram statistics and latent topic variables by extending a unigram topic model to include properties of a hierarchical Dirichlet bigram language model is explored.
A correlated topic model of Science
- Computer Science
- 2007
The correlated topic model (CTM) is developed, where the topic proportions exhibit correlation via the logistic normal distribution, and its use as an exploratory tool for large document collections is demonstrated.
The Author-Topic Model for Authors and Documents
- Computer Science, UAI
- 2004
The author-topic model is introduced, a generative model for documents that extends Latent Dirichlet Allocation to include authorship information, and applications to computing similarity between authors and entropy of author output are demonstrated.
Continuous Time Dynamic Topic Models
- Computer Science, UAI
- 2008
An efficient variational approximate inference algorithm is derived that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points.
Finding scientific topics
- Computer Science, Proceedings of the National Academy of Sciences of the United States of America
- 2004
A generative model for documents, introduced by Blei, Ng, and Jordan, is described, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.
Topics over time: a non-Markov continuous-time model of topical trends
- Computer Science, KDD '06
- 2006
An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.