Corpus ID: 219636223

Query Training: Learning and inference for directed and undirected graphical models

Miguel Lázaro-Gredilla, Wolfgang Lehrach, Nishad Gothoskar, Guangyao Zhou, Antoine Dedieu, Dileep George
Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way: after learning the parameters of a graphical model, new probabilistic queries can be answered at test time without retraining. However, learning undirected graphical models is notoriously hard due to the intractability of the partition function. For directed models, a popular approach is to use variational autoencoders, but there is no systematic way to choose the encoder…
1 Citation

From CAPTCHA to Commonsense: How Brain Can Teach Us About Artificial Intelligence
A neuroscience-inspired generative model of vision is presented as a case study for a strategy to gain insights from the brain by simultaneously looking at the world it acts upon and the computational framework to support efficient learning and generalization.


Neural Variational Inference and Learning in Undirected Graphical Models
This work proposes black-box learning and inference algorithms for undirected models that optimize a variational approximation to the log-likelihood of the model via a unified variational inference framework and empirically demonstrates the effectiveness of the method on several popular generative modeling datasets.
Piecewise Training for Undirected Models
This paper shows that this piecewise method can be justified as minimizing a new family of upper bounds on the log partition function, and on three natural-language data sets, piecewise training is more accurate than pseudolikelihood, and often performs comparably to global training using belief propagation.
Adversarial Variational Inference and Learning in Markov Random Fields
The Adversarial Variational Inference and Learning (AVIL) algorithm is proposed to solve the problems with a minimal assumption about the model structure of an MRF with better results than existing competitors on several real datasets.
Auto-Encoding Variational Bayes
A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Learning large-scale conditional random fields
The Shotgun algorithm for parallel regression can achieve near-linear speedups, and extensive experiments show it to be one of the fastest methods for sparse regression, a key component of the CRF learning methods.
Variational Inference: A Review for Statisticians
Variational inference (VI), a method from machine learning that approximates probability densities through optimization, is reviewed and a variant that uses stochastic optimization to scale up to massive data is derived.
Piecewise pseudolikelihood for efficient training of conditional random fields
On several benchmark NLP data sets, piecewise pseudolikelihood has better accuracy than standard pseudolikelihood, and in many cases is nearly equivalent to maximum likelihood, with five to ten times less training time than batch CRF training.
Local Training and Belief Propagation
Because maximum-likelihood training is intractable for general factor graphs, an appealing alternative is local training, which approximates the likelihood gradient without performing global…
Stochastic Backpropagation and Approximate Inference in Deep Generative Models
We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and…
Training Products of Experts by Minimizing Contrastive Divergence
A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary because it is hard even to approximate the derivatives of the renormalization term in the combination rule.