• Corpus ID: 73728535

Bayesian Allocation Model: Inference by Sequential Monte Carlo for Nonnegative Tensor Factorizations and Topic Models using Polya Urns

Ali Taylan Cemgil, M. Burak Kurutmaz, Sinan Yıldırım, Melih Barsbey, Umut Simsekli
We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation. BAM is based on a Poisson process whose events are marked using a Bayesian network, and the conditional probability tables of this network are then integrated out analytically. We show…
Bayesian Allocation Model: Marginal Likelihood-Based Model Selection for Count Tensors
A novel sequential Monte Carlo (SMC) algorithm for marginal likelihood estimation in BAM is developed, leading to a unified scoring method for discrete variable Bayesian networks with hidden nodes, including various NTF and topic models.
Model selection for relational data factorization
This work proposes to estimate model order for mixed membership blockmodels (MMSB) within the generic allocation framework of Bayesian allocation model (BAM), describes how relational data is represented as Poisson counts of the allocation model, and demonstrates the results both on synthetic and real-world data sets.
Allocative Poisson Factorization for Computational Social Science


Variational Bayesian learning of directed graphical models with hidden variables
It is proved that a variational Bayesian (VB) approximation can always be constructed in such a way that guarantees it to be more accurate than the Cheeseman-Stutz (CS) approximation, and both are compared to a sampling-based gold standard, annealed importance sampling (AIS).
A hierarchical model for ordinal matrix factorization
The model is evaluated on a collaborative filtering task, where users have rated a collection of movies and the system is asked to predict their ratings for other movies; the evaluation shows that the suggested model outperforms alternative factorization techniques.
Scalable Recommendation with Poisson Factorization
A variational inference algorithm for approximate posterior inference is developed that scales up to massive data sets; it iterates efficiently over the observed entries and adjusts an approximate posterior over the user/item representations.
Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models
A parallel sparse partially collapsed Gibbs sampler is proposed and compared, and it is proved that the partially collapsed samplers scale well with the size of the corpus and can be used in more modeling situations than the ordinary collapsed sampler.
Tensor decompositions for learning latent variable models
A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices; this implies a robust and computationally tractable estimation approach for several popular latent variable models.
Pólya Urn Latent Dirichlet Allocation: A Doubly Sparse Massively Parallel Sampler
A novel sampler is presented that is faster, both empirically and theoretically, than previous Gibbs samplers for LDA, and it is proved that the approximation error vanishes with data size, making the algorithm asymptotically exact.
Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions
We show that the only parameter prior for complete Gaussian DAG models that satisfies global parameter independence, complete model equivalence, and some weak regularity assumptions, is the
Sum Conditioned Poisson Factorization
A family of fully conjugate tensor decomposition models for binary, ordinal or multinomial data is devised as a result, which can be used as a generic building block in hierarchical models for arrays of such data.
Being Bayesian About Network Structure. A Bayesian Approach to Structure Discovery in Bayesian Networks
This paper shows how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed order over network variables, and uses this result as the basis for an algorithm that approximates the Bayesian posterior of a feature.
Pachinko allocation: DAG-structured mixture models of topic correlations
Improved performance of PAM is shown in document classification, likelihood of held-out data, the ability to support finer-grained topics, and topical keyword coherence.