• Publications
  • Influence
Latent Dirichlet Allocation
Hierarchical Dirichlet Processes
TLDR
This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the childDirichlet processes is itself distributed according to a Dirichlet process.
Probabilistic Topic Models
  • D. Blei
  • Computer Science
    IEEE Signal Processing Magazine
  • 18 October 2010
TLDR
Surveying a suite of algorithms that offer a solution to managing large document archives suggests they are well-suited to handle large amounts of data.
Variational Inference: A Review for Statisticians
TLDR
Variational inference (VI), a method from machine learning that approximates probability densities through optimization, is reviewed and a variant that uses stochastic optimization to scale up to massive data is derived.
Stochastic variational inference
TLDR
Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
Mixed Membership Stochastic Blockmodels
TLDR
This paper describes a latent variable model of such data called the mixed membership stochastic blockmodel, which extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation.
Dynamic topic models
TLDR
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.
Online Learning for Latent Dirichlet Allocation
TLDR
An online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA) based on online stochastic optimization with a natural gradient step is developed, which shows converges to a local optimum of the VB objective function.
Collaborative topic modeling for recommending scientific articles
TLDR
An algorithm to recommend scientific articles to users of an online community that combines the merits of traditional collaborative filtering and probabilistic topic modeling and can form recommendations about both existing and newly published articles is developed.
Supervised Topic Models
TLDR
The supervised latent Dirichlet allocation (sLDA) model, a statistical model of labelled documents, is introduced, which derives a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations.
...
...