Nonparametric Bayes Pachinko Allocation

@inproceedings{Li2007NonparametricBP,
  title={Nonparametric Bayes Pachinko Allocation},
  author={Wei Li and David M. Blei and Andrew McCallum},
  booktitle={UAI},
  year={2007}
}
Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset…
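To make the DAG structure concrete, here is a minimal sketch of the generative process for the commonly used four-level instance of PAM (root, super-topics, sub-topics, words). All sizes, hyperparameters, and names are illustrative assumptions, not the paper's own notation; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes for a four-level PAM: root -> super-topics -> sub-topics -> words
n_super, n_sub, vocab = 3, 5, 20

# Topic-word distributions (one Dirichlet draw per sub-topic)
phi = rng.dirichlet(np.full(vocab, 0.1), size=n_sub)

def generate_document(n_words, alpha_root=1.0, alpha_super=1.0):
    """Sample one document from the four-level PAM generative process."""
    # Per-document distribution over super-topics (the root's children)
    theta_root = rng.dirichlet(np.full(n_super, alpha_root))
    # Per-document distributions over sub-topics, one for each super-topic;
    # these per-super-topic distributions are what encode topic correlations
    theta_super = rng.dirichlet(np.full(n_sub, alpha_super), size=n_super)
    words = []
    for _ in range(n_words):
        z1 = rng.choice(n_super, p=theta_root)      # pick a super-topic
        z2 = rng.choice(n_sub, p=theta_super[z1])   # pick one of its sub-topics
        words.append(rng.choice(vocab, p=phi[z2]))  # emit a word from that sub-topic
    return words

doc = generate_document(50)
```

The nonparametric extension the paper proposes replaces the fixed `n_super`/`n_sub` above with Dirichlet-process-based priors so the topic structure need not be specified in advance.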
The generalized Dirichlet distribution in enhanced topic detection
TLDR
A new, robust and computationally efficient hierarchical Bayesian model for effective topic correlation modeling that captures correlations between topics, is faster to infer than CTM and PAM, and is effective in avoiding over-fitting as the number of topics is increased.
Correlation between the Topic and Documents Based on the Pachinko Allocation Model
TLDR
The pachinko allocation model (PAM) is proposed, which improves upon earlier topic models such as LDA by modeling correlations between topics in addition to the word correlations which constitute topics.
The Doubly Correlated Nonparametric Topic Model
TLDR
A doubly correlated nonparametric topic (DCNT) model is proposed, the first model to simultaneously capture all three of these properties, and the semantic structure and predictive performance of the DCNT are validated using a corpus of NIPS documents annotated by various metadata.
Variational Inference In Pachinko Allocation Machines
TLDR
This paper presents an efficient and flexible amortized variational inference method for PAM, using a deep inference network to parameterize the approximate posterior distribution in a manner similar to the variational autoencoder.
Mixtures of hierarchical topics with Pachinko allocation
TLDR
Hierarchical PAM is presented, an enhancement that explicitly represents a topic hierarchy and can be seen as combining the advantages of hLDA's topical hierarchy representation with PAM's ability to mix multiple leaves of the topic hierarchy.
The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies
TLDR
An application to information retrieval is described in which documents are modeled as paths down a random tree; the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction.
The Supervised Hierarchical Dirichlet Process
  • Andrew M. Dai, A. Storkey
  • Mathematics, Computer Science
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2015
We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated…
Dynamic Stacked Topic Model
TLDR
The Dynamic Stacked Topic Model proposed here is a topic model for analyzing the hierarchical structure and time evolution of topics in document collections; inference and parameter estimation can be achieved by a stochastic EM algorithm.

References

SHOWING 1-10 OF 15 REFERENCES
Pachinko allocation: DAG-structured mixture models of topic correlations
TLDR
Improved performance of PAM is shown in document classification, likelihood of held-out data, the ability to support finer-grained topics, and topical keyword coherence.
Correlated Topic Models
TLDR
The correlated topic model (CTM) is developed, where the topic proportions exhibit correlation via the logistic normal distribution, and a mean-field variational inference algorithm is derived for approximate posterior inference in this model, which is complicated by the fact that the logistic normal is not conjugate to the multinomial.
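The logistic-normal construction at the heart of the CTM can be sketched in a few lines; unlike a Dirichlet, the off-diagonal entries of the covariance matrix let topic proportions covary. Sizes and values below are illustrative assumptions, with NumPy assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4  # number of topics (illustrative)

# Mean and covariance of the logistic normal; the off-diagonal 0.5 terms
# encode positive correlation between every pair of topics (this matrix
# is positive definite, so it is a valid covariance).
mu = np.zeros(K)
Sigma = 0.5 * np.eye(K) + 0.5 * np.ones((K, K))

# Draw log-odds from a Gaussian, then squash to the simplex
eta = rng.multivariate_normal(mu, Sigma)
theta = np.exp(eta) / np.exp(eta).sum()  # logistic-normal topic proportions
```

The non-conjugacy mentioned in the summary arises exactly here: `theta` is a deterministic transform of a Gaussian rather than a Dirichlet draw, so the multinomial likelihood no longer yields a closed-form posterior.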
Hierarchical Topic Models and the Nested Chinese Restaurant Process
TLDR
A Bayesian approach is taken to generate an appropriate prior via a distribution on partitions that allows arbitrarily large branching factors and readily accommodates growing data collections.
Latent Dirichlet Allocation
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and…
A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior
TLDR
A Bayesian framework is developed for tackling the supervised clustering problem (the generic problem encountered in tasks such as reference matching, coreference resolution, identity uncertainty, and record linkage), and it is able to outperform other models across a variety of tasks and performance metrics.
Finding scientific topics
  • T. Griffiths, M. Steyvers
  • Computer Science, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 2004
TLDR
A generative model for documents, introduced by Blei, Ng, and Jordan, is described, and a Markov chain Monte Carlo algorithm is presented for inference in this model; the algorithm is used to analyze abstracts from PNAS, using Bayesian model selection to establish the number of topics.
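The collapsed Gibbs sampler popularized by Griffiths and Steyvers for LDA can be sketched as follows. This is a minimal illustrative implementation, not the paper's code; documents are lists of integer word ids, and all names, sizes, and hyperparameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def lda_gibbs(docs, K, V, alpha=0.1, beta=0.01, iters=50):
    """Collapsed Gibbs sampling for LDA over integer-coded documents."""
    ndk = np.zeros((len(docs), K))   # document-topic counts
    nkw = np.zeros((K, V))           # topic-word counts
    nk = np.zeros(K)                 # per-topic token totals
    z = []                           # topic assignment for every token
    # Random initialization of topic assignments and count tables
    for d, doc in enumerate(docs):
        zd = rng.integers(K, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    # Sweep over tokens, resampling each assignment from its conditional:
    # p(z=k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw

# Toy corpus: two tiny documents over a 5-word vocabulary
docs = [[0, 1, 2, 1], [3, 4, 3, 4, 4]]
ndk, nkw = lda_gibbs(docs, K=2, V=5)
```

The same collapsed-counts idea underlies the Gibbs samplers used for PAM and its nonparametric extension, with extra count tables for the additional levels of the DAG.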
Hierarchical Dirichlet Processes
We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that…
Variable selection in clustering via Dirichlet process mixture models
TLDR
This paper introduces a latent binary vector to identify discriminating variables and uses Dirichlet process mixture models to define the cluster structure; it updates the variable selection index using a Metropolis algorithm and obtains inference on the cluster structure via a split-merge Markov chain Monte Carlo technique.
Bayesian Haplotype Inference via the Dirichlet Process
TLDR
A Bayesian approach to the problem of inferring haplotypes from genotypes of single nucleotide polymorphisms (SNPs) based on a nonparametric prior known as the Dirichlet process is presented, which is reminiscent of parsimony methods in its preference for small haplotype pools.
Bayesian haplotype inference via the Dirichlet process
TLDR
A Bayesian approach to the problem of inferring haplotypes from genotypes of single nucleotide polymorphisms based on a nonparametric prior known as the Dirichlet process is presented, which incorporates a likelihood that captures statistical errors in the haplotype/genotype relationship.