# Nonparametric Bayes Pachinko Allocation

@inproceedings{Li2007NonparametricBP, title={Nonparametric Bayes Pachinko Allocation}, author={Wei Li and David M. Blei and Andrew McCallum}, booktitle={UAI}, year={2007} }

Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset…

#### 90 Citations

The generalized dirichlet distribution in enhanced topic detection

- Computer Science
- CIKM
- 2012

A new, robust, and computationally efficient hierarchical Bayesian model for topic correlation modeling that captures correlations between topics, is faster to infer than CTM and PAM, and avoids over-fitting as the number of topics increases.

Correlation between the Topic and Documents Based on the Pachinko Allocation Model

- Computer Science
- 2016

The pachinko allocation model (PAM) is proposed, which improves upon earlier topic models such as LDA by modeling correlations between topics in addition to the word correlations which constitute topics.

The Doubly Correlated Nonparametric Topic Model

- Computer Science
- NIPS
- 2011

A doubly correlated nonparametric topic (DCNT) model is proposed, the first model to simultaneously capture all three of these properties; the semantic structure and predictive performance of the DCNT are validated using a corpus of NIPS documents annotated by various metadata.

Variational Inference In Pachinko Allocation Machines

- Computer Science, Mathematics
- ArXiv
- 2018

This paper presents an efficient and flexible amortized variational inference method for PAM, using a deep inference network to parameterize the approximate posterior distribution in a manner similar to the variational autoencoder.

Mixtures of hierarchical topics with Pachinko allocation

- Computer Science
- ICML '07
- 2007

Hierarchical PAM is presented: an enhancement that explicitly represents a topic hierarchy and can be seen as combining the advantages of hLDA's topical hierarchy representation with PAM's ability to mix multiple leaves of the topic hierarchy.

The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies

- Computer Science, Mathematics
- JACM
- 2010

An application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction.

The nested Chinese restaurant process and hierarchical topic models

- Computer Science
- 2007

An application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction.

The nested Chinese restaurant process and Bayesian inference of topic hierarchies

- Computer Science
- 2007

An application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction.

The Supervised Hierarchical Dirichlet Process

- Mathematics, Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2015

We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated…

Dynamic Stacked Topic Model

- Computer Science
- 2016

The Dynamic Stacked Topic Model proposed here is a topic model for analyzing the hierarchical structure and the time evolution of topics in document collections; inference and parameter estimation can be carried out with a stochastic EM algorithm.

#### References

Pachinko allocation: DAG-structured mixture models of topic correlations

- Computer Science
- ICML
- 2006

Improved performance of PAM is shown in document classification, likelihood of held-out data, the ability to support finer-grained topics, and topical keyword coherence.

Correlated Topic Models

- Computer Science, Mathematics
- NIPS
- 2005

The correlated topic model (CTM) is developed, where the topic proportions exhibit correlation via the logistic normal distribution. A mean-field variational inference algorithm is derived for approximate posterior inference in this model, which is complicated by the fact that the logistic normal is not conjugate to the multinomial.

Hierarchical Topic Models and the Nested Chinese Restaurant Process

- Computer Science
- NIPS
- 2003

A Bayesian approach is taken to generate an appropriate prior via a distribution on partitions that allows arbitrarily large branching factors and readily accommodates growing data collections.

Latent Dirichlet Allocation

- Computer Science
- J. Mach. Learn. Res.
- 2003

We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and…

A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2005

A Bayesian framework is developed for tackling the supervised clustering problem, the generic problem encountered in tasks such as reference matching, coreference resolution, identity uncertainty, and record linkage; it is able to outperform other models across a variety of tasks and performance metrics.

Finding scientific topics

- Computer Science, Medicine
- Proceedings of the National Academy of Sciences of the United States of America
- 2004

A generative model for documents is described, introduced by Blei, Ng, and Jordan, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.

Hierarchical Dirichlet Processes

- Mathematics
- 2006

We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that…

Variable selection in clustering via Dirichlet process mixture models

- Mathematics, Computer Science
- 2006

This paper introduces a latent binary vector to identify discriminating variables and uses Dirichlet process mixture models to define the cluster structure. It updates the variable selection index using a Metropolis algorithm and obtains inference on the cluster structure via a split-merge Markov chain Monte Carlo technique.

Bayesian Haplotype Inference via the Dirichlet Process

- Mathematics, Medicine
- J. Comput. Biol.
- 2007

A Bayesian approach to the problem of inferring haplotypes from genotypes of single nucleotide polymorphisms (SNPs) based on a nonparametric prior known as the Dirichlet process is presented, which is reminiscent of parsimony methods in its preference for small haplotype pools.

Bayesian haplotype inference via the Dirichlet process

- Mathematics, Computer Science
- ICML
- 2004

A Bayesian approach to the problem of inferring haplotypes from genotypes of single nucleotide polymorphisms based on a nonparametric prior known as the Dirichlet process is presented, which incorporates a likelihood that captures statistical errors in the haplotype/genotype relationship.