A first step in identifying the content of a document is determining which topics that document addresses. We describe a generative model for documents, introduced by Blei, Ng, and Jordan [Blei, D. M., Ng, A. Y. & Jordan, M. I. (2003) J. Machine Learn. Res. 3, 993-1022], in which each document is generated by choosing a distribution over topics and then choosing each word in the document from a topic selected according to this distribution.
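That two-stage generative process is compact enough to sketch directly. Below is a minimal illustration in Python; the symmetric Dirichlet priors, the dimensions, and all variable names are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

n_topics, vocab_size, doc_length = 4, 10, 20
alpha, beta = 0.1, 0.01  # assumed symmetric Dirichlet hyperparameters

# One distribution over words per topic, drawn from Dirichlet(beta).
phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)

def generate_document():
    # Choose the document's distribution over topics ...
    theta = rng.dirichlet(np.full(n_topics, alpha))
    words = []
    for _ in range(doc_length):
        # ... then choose each word from a topic drawn from that distribution.
        z = rng.choice(n_topics, p=theta)
        words.append(rng.choice(vocab_size, p=phi[z]))
    return words

print(generate_document())
```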
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features. We identify a simple generative process that results in the same distribution over equivalence classes, which we call the Indian buffet process.
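One common formulation of that generative process has each object take existing features in proportion to their popularity and then sample a Poisson number of new ones. The sketch below assumes this formulation; alpha and the helper name sample_ibp are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ibp(n_objects, alpha=2.0):
    """Sample a binary feature matrix via the Indian buffet process.

    Row i is object i; columns are features. The number of columns
    is random and grows with the data."""
    counts = []  # counts[k] = number of objects that have feature k
    rows = []
    for i in range(1, n_objects + 1):
        # Take each existing feature k with probability counts[k] / i ...
        row = [rng.random() < m / i for m in counts]
        for k, taken in enumerate(row):
            counts[k] += taken
        # ... then sample Poisson(alpha / i) brand-new features.
        new = rng.poisson(alpha / i)
        row.extend([True] * new)
        counts.extend([1] * new)
        rows.append(row)
    Z = np.zeros((n_objects, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

print(sample_ibp(5))
```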
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting: which of the large collection of possible trees should we use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections.
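The nested process is built from the ordinary Chinese restaurant process, which defines the distribution on partitions mentioned above. A minimal sketch of the basic CRP, with an assumed concentration parameter gamma:

```python
import numpy as np

rng = np.random.default_rng(0)

def crp_partition(n_customers, gamma=1.0):
    """Seat customers via the Chinese restaurant process.

    Customer i joins an existing table with probability proportional
    to its occupancy, or starts a new table with probability
    proportional to gamma."""
    tables = []       # tables[k] = number of customers at table k
    assignments = []
    for _ in range(n_customers):
        probs = np.array(tables + [gamma], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(tables):
            tables.append(1)   # open a new table
        else:
            tables[k] += 1
        assignments.append(k)
    return assignments

print(crp_partition(10))
```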
Processing language requires the retrieval of concepts from memory in response to an ongoing stream of information. This retrieval is facilitated if one can infer the gist of a sentence, conversation, or document and use that gist to predict related concepts and disambiguate words. This article analyzes the abstract computational problem underlying the extraction and use of gist.
We introduce the author-topic model, a generative model for documents that extends Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author is associated with a multinomial distribution over topics and each topic is associated with a multinomial distribution over words. A document with multiple authors is modeled as a distribution over topics that is a mixture of the distributions associated with the authors.
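The per-word generative step is easy to sketch: attribute the word to one of the document's co-authors, draw a topic from that author's distribution, then draw the word from the topic. The sketch below assumes a uniform choice among co-authors and illustrative hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

n_authors, n_topics, vocab_size = 3, 4, 12
alpha, beta = 0.5, 0.1  # assumed Dirichlet hyperparameters

theta = rng.dirichlet(np.full(n_topics, alpha), size=n_authors)  # author -> topics
phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)    # topic -> words

def generate_word(doc_authors):
    a = rng.choice(doc_authors)            # attribute the word to one co-author
    z = rng.choice(n_topics, p=theta[a])   # topic from that author's distribution
    return rng.choice(vocab_size, p=phi[z])

print([generate_word([0, 2]) for _ in range(15)])
```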
Relationships between concepts account for a large proportion of semantic knowledge. We present a nonparametric Bayesian model that discovers systems of related concepts. Given data involving several sets of entities, our model discovers the kinds of entities in each set and the relations between kinds that are possible or likely. We apply our approach to …
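To make the setup concrete, here is a sketch of the generative side of such a model for the simplest case: one entity set and one binary relation. Entities are partitioned into kinds by a Chinese restaurant process, and each pair of kinds gets a link probability drawn from a Beta prior; all names and hyperparameters are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def crp(n, gamma=1.0):
    """Partition n entities into kinds via a Chinese restaurant process."""
    counts, z = [], []
    for _ in range(n):
        p = np.array(counts + [gamma], dtype=float)
        k = rng.choice(len(p), p=p / p.sum())
        if k == len(counts):
            counts.append(0)   # a brand-new kind
        counts[k] += 1
        z.append(k)
    return np.array(z)

def generate_relation(n_entities=8, gamma=1.0, beta_a=1.0, beta_b=1.0):
    z = crp(n_entities, gamma)   # kind of each entity
    n_kinds = z.max() + 1
    # Probability of a link between each pair of kinds, from a Beta prior.
    eta = rng.beta(beta_a, beta_b, size=(n_kinds, n_kinds))
    # Each entity-pair link is Bernoulli with its kinds' link probability.
    R = (rng.random((n_entities, n_entities)) < eta[z][:, z]).astype(int)
    return z, R

kinds, R = generate_relation()
print(kinds)
print(R)
```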
Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show, using part-of-speech tagging, that a fully Bayesian approach can greatly improve performance.
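The contrast between MLE and the fully Bayesian approach shows up in a one-line calculation. With a symmetric Dirichlet prior, integrating over the multinomial parameters yields the Dirichlet-multinomial posterior predictive distribution; the counts and hyperparameter below are made up.

```python
import numpy as np

# Transition counts from some tag to each of 4 possible next tags.
counts = np.array([3, 1, 0, 0])
beta = 0.5  # assumed symmetric Dirichlet hyperparameter

# MLE commits to a single parameter estimate: unseen transitions get
# probability zero.
mle = counts / counts.sum()

# Integrating over all multinomial parameters under a Dirichlet(beta)
# prior gives the posterior predictive, which reserves mass for
# unseen events.
bayes = (counts + beta) / (counts.sum() + beta * len(counts))

print(mle)    # [0.75 0.25 0.   0.  ]
print(bayes)  # [0.583 0.25  0.083 0.083]
```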
We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to …
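A path through the nCRP's infinitely branching tree can be drawn by running a CRP at each node along the way. The sketch below truncates the tree to a fixed depth for illustration; gamma and the data structure are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ncrp_paths(n_documents, depth=3, gamma=1.0):
    """Assign each document a root-to-leaf path by running a Chinese
    restaurant process at every node (depth is truncated here)."""
    child_counts = {}  # node path -> visit counts of its children
    paths = []
    for _ in range(n_documents):
        path = ()
        for _level in range(depth):
            counts = child_counts.setdefault(path, [])
            probs = np.array(counts + [gamma], dtype=float)
            probs /= probs.sum()
            k = rng.choice(len(probs), p=probs)
            if k == len(counts):
                counts.append(1)   # branch into a brand-new subtree
            else:
                counts[k] += 1
            path = path + (k,)
        paths.append(path)
    return paths

print(ncrp_paths(5))
```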
Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words.
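The statistical regularity at issue in Saffran et al.'s experiments is the transitional probability between syllables, which tends to be high within words and low across word boundaries. A toy computation, with a made-up syllable stream in the spirit of their artificial-language stimuli:

```python
from collections import Counter

# A syllable stream generated by concatenating three two-syllable-per-
# character "words" (illustrative, not the original stimuli).
stream = "bidakupadotigolabubidakugolabupadoti"
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

pair_counts = Counter(zip(syllables, syllables[1:]))
first_counts = Counter(syllables[:-1])

def transitional_probability(x, y):
    # TP(y | x) = count(xy) / count(x): high within words,
    # lower across word boundaries.
    return pair_counts[(x, y)] / first_counts[x]

print(transitional_probability("bi", "da"))  # within-word: 1.0
print(transitional_probability("ku", "pa"))  # across a boundary: 0.5
```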
We present a framework for the rational analysis of elemental causal induction (learning about the existence of a relationship between a single cause and effect) based upon causal graphical models. This framework makes precise the distinction between causal structure and causal strength: the difference between asking whether a causal relationship exists and asking how strong it is.
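The structure question can be framed as a Bayesian comparison between a graph in which the cause influences the effect and one in which it does not, with the log likelihood ratio measuring support for the causal link. The sketch below assumes a noisy-OR parameterization, uniform priors on the strength parameters, a simple grid approximation to the integrals, and hypothetical contingency counts.

```python
import numpy as np

def log_marginal_no_link(data, grid=200):
    """Background cause only: P(e+) = w0 on every trial. Average the
    likelihood over a uniform prior on w0 (grid approximation)."""
    n_e1_c1, n_e0_c1, n_e1_c0, n_e0_c0 = data
    w0 = np.linspace(1e-6, 1 - 1e-6, grid)
    like = w0 ** (n_e1_c1 + n_e1_c0) * (1 - w0) ** (n_e0_c1 + n_e0_c0)
    return np.log(like.mean())

def log_marginal_link(data, grid=200):
    """Noisy-OR: P(e+ | c+) = w0 + w1 - w0*w1 and P(e+ | c-) = w0.
    Average over uniform priors on both strength parameters."""
    n_e1_c1, n_e0_c1, n_e1_c0, n_e0_c0 = data
    w0 = np.linspace(1e-6, 1 - 1e-6, grid)[:, None]
    w1 = np.linspace(1e-6, 1 - 1e-6, grid)[None, :]
    p1 = w0 + w1 - w0 * w1
    like = (p1 ** n_e1_c1 * (1 - p1) ** n_e0_c1
            * w0 ** n_e1_c0 * (1 - w0) ** n_e0_c0)
    return np.log(like.mean())

# Hypothetical counts: (e+ with cause, e- with cause,
#                       e+ without cause, e- without cause)
data = (7, 1, 1, 7)
support = log_marginal_link(data) - log_marginal_no_link(data)
print(support)  # positive values favor the existence of the causal link
```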