• Publications
  • Influence
Infinite latent feature models and the Indian buffet process
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior inExpand
  • 746
  • 186
  • Open Access
The Author-Topic Model for Authors and Documents
We introduce the author-topic model, a generative model for documents that extends Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author isExpand
  • 1,420
  • 128
  • Open Access
Hierarchical Topic Models and the Nested Chinese Restaurant Process
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a BayesianExpand
  • 954
  • 110
  • Open Access
Topics in semantic representation.
Processing language requires the retrieval of concepts from memory in response to an ongoing stream of information. This retrieval is facilitated if one can infer the gist of a sentence,Expand
  • 874
  • 106
  • Open Access
Learning Systems of Concepts with an Infinite Relational Model
Relationships between concepts account for a large proportion of semantic knowledge. We present a nonparametric Bayesian model that discovers systems of related concepts. Given data involving severalExpand
  • 657
  • 101
  • Open Access
Probabilistic Topic Models
  • 902
  • 95
The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies
We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how thisExpand
  • 558
  • 80
  • Open Access
The Indian Buffet Process: An Introduction and Review
The Indian buffet process is a stochastic process defining a probability distribution over equivalence classes of sparse binary matrices with a finite number of rows and an unbounded number ofExpand
  • 339
  • 74
  • Open Access
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic process. Each author isExpand
  • 621
  • 62
  • Open Access
A Bayesian framework for word segmentation: Exploring the effects of context
Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926-1928], there has been a great deal of interestExpand
  • 383
  • 57
  • Open Access