Hierarchical Dirichlet Processes


a grant from DARPA in support of the CALO program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman, and thank the referees for useful comments.

Abstract

We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of a stick-breaking process, and a generalization of the Chinese restaurant process that we refer to as the "Chinese restaurant franchise." We present Markov chain Monte Carlo algorithms for posterior inference in hierarchical Dirichlet process mixtures, and describe applications to problems in information retrieval and text modelling.
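The "Chinese restaurant franchise" named in the abstract can be illustrated with a small simulation. The following is a minimal sketch in Python, not the authors' implementation. It assumes the usual seating metaphor: each group is a restaurant, each customer joins an existing table with probability proportional to the number of customers already there or opens a new table with weight alpha, and each new table picks a dish (a mixture component) with probability proportional to the number of tables across all restaurants already serving it, or a new dish with weight gamma. The function name, the parameter names alpha and gamma, and the group sizes are illustrative choices.

```python
import random


def chinese_restaurant_franchise(group_sizes, alpha, gamma, seed=0):
    """Sample a dish (mixture component) label for every customer in every group.

    alpha: concentration of the per-group (child) processes (assumed name).
    gamma: concentration of the shared top-level process (assumed name).
    """
    rng = random.Random(seed)
    dish_table_counts = []  # number of tables, over all groups, serving each dish
    assignments = []        # assignments[j][i] = dish of customer i in group j

    for n_customers in group_sizes:
        table_sizes = []   # customers seated at each table of this group
        table_dishes = []  # dish served at each table of this group
        labels = []
        for _ in range(n_customers):
            # Existing table t with weight table_sizes[t]; new table with weight alpha.
            t = rng.choices(range(len(table_sizes) + 1),
                            weights=table_sizes + [alpha])[0]
            if t == len(table_sizes):
                # New table: existing dish k with weight dish_table_counts[k];
                # brand-new dish with weight gamma.
                k = rng.choices(range(len(dish_table_counts) + 1),
                                weights=dish_table_counts + [gamma])[0]
                if k == len(dish_table_counts):
                    dish_table_counts.append(0)
                dish_table_counts[k] += 1
                table_sizes.append(0)
                table_dishes.append(k)
            table_sizes[t] += 1
            labels.append(table_dishes[t])
        assignments.append(labels)

    return assignments, dish_table_counts


# Usage: three groups of 50 observations each (hypothetical sizes).
assignments, dish_table_counts = chinese_restaurant_franchise(
    [50, 50, 50], alpha=1.0, gamma=1.0
)
dishes_per_group = [sorted(set(labels)) for labels in assignments]
print(dishes_per_group)
```

Because dish indices are drawn from a single shared list, the same index can appear in several groups. This is the sharing of mixture components between groups that the abstract describes. The number of dishes is not fixed in advance and grows with the data.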
