Corpus ID: 51841161

Stochastic Gradient VB and the Variational Auto-Encoder

by Diederik P. Kingma and Max Welling
How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contribution is two-fold. First, we show that a reparameterization of the variational lower bound yields a…
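The reparameterization the abstract refers to can be sketched on a toy objective. This is a minimal illustration, not the paper's implementation: a Gaussian sample z ~ N(mu, sigma^2) is rewritten as z = mu + sigma * eps with eps ~ N(0, 1), so the sample becomes a deterministic, differentiable function of the variational parameters and Monte Carlo gradients can flow through it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reparameterization trick: instead of sampling z ~ N(mu, sigma^2) directly,
# write z = mu + sigma * eps with eps ~ N(0, 1). The sample is now a
# differentiable function of (mu, sigma).
def reparam_sample(mu, sigma, n):
    eps = rng.standard_normal(n)
    return mu + sigma * eps

# Toy objective: E_z[z^2] for z ~ N(mu, sigma^2). The true gradient w.r.t.
# mu is 2*mu; with z = mu + sigma*eps, d(z^2)/d(mu) = 2*z per sample.
mu, sigma = 1.5, 0.8
z = reparam_sample(mu, sigma, 200_000)
grad_mu_estimate = np.mean(2.0 * z)   # Monte Carlo gradient estimate
print(grad_mu_estimate)               # close to 2*mu = 3.0
```

The same trick applied to the variational lower bound is what makes the estimator optimizable with standard stochastic gradient methods.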
Advances in Variational Inference
An overview of recent trends in variational inference is given and a summary of promising future research directions is provided.
Variational Noise-Contrastive Estimation
It is proved that VNCE can be used for both parameter estimation of unnormalised models and posterior inference of latent variables, and has the same level of generality as standard VI, meaning that advances made there can be directly imported to the unnormalised setting.
ALMOND: Adaptive Latent Modeling and Optimization via Neural Networks and Langevin Diffusion
Latent variable models cover a broad range of statistical and machine learning models, such as Bayesian models, linear mixed models, and Gaussian mixture models, which help improve the quality of existing methods.
Iterative Amortized Inference
This work proposes iterative inference models, which learn to perform inference optimization through repeatedly encoding gradients, and demonstrates the inference optimization capabilities of these models and shows that they outperform standard inference models on several benchmark data sets of images and text. Expand
Non-conjugate Posterior using Stochastic Gradient Ascent with Adaptive Stepsize
  • Kart-Leong Lim
  • Computer Science
  • 2020 International Joint Conference on Neural Networks (IJCNN)
  • 2020
A novel approach based on the recently proposed constant-stepsize stochastic gradient ascent allows large-scale learning on non-conjugate posteriors; inspired by SVI and Adam, the novel use of adaptive stepsizes in this method significantly improves its learning.
Iterative Inference Models
Inference models, which replace an optimization-based inference procedure with a learned model, have been fundamental in advancing Bayesian deep learning, the most notable example being variational…
Variational posterior approximation using stochastic gradient ascent with adaptive stepsize
This work explores stochastic gradient ascent as a fast algorithm for posterior approximation of the Dirichlet process mixture; the stepsize is first optimized using the momentum method, then Fisher information is introduced to allow an adaptive stepsize in the posterior approximation.
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
A new variant of NP model that is called Doubly Stochastic Variational Neural Process (DSVNP), which combines the global latent variable and local latent variables for prediction and demonstrates competitive prediction performance in multi-output regression and uncertainty estimation in classification.
Automatic Relevance Determination For Deep Generative Models
A recurring problem when building probabilistic latent variable models is regularization and model selection, for instance, the choice of the dimensionality of the latent space. In the context of…
A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI
A distribution over variational parameters is derived, designed to minimize a bound on the divergence between the resulting marginal distribution and the target, and an example of how to sample from this distribution in a way that interpolates between the behavior of existing methods based on Langevin dynamics and stochastic gradient variational inference (SGVI).
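One endpoint of the interpolation above, unadjusted Langevin dynamics, can be sketched on a toy target. This is an illustrative sketch (standard normal target, assumed stepsize), not the paper's method: each step moves along the gradient of the log density and adds Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(2)

# Unadjusted Langevin dynamics on a toy target:
#   x <- x + (step/2) * grad log p(x) + sqrt(step) * noise.
# The target here is N(0, 1), so grad log p(x) = -x.
def langevin_chain(x0, step=0.1, n=50_000):
    x = x0
    samples = np.empty(n)
    for i in range(n):
        x = x + 0.5 * step * (-x) + np.sqrt(step) * rng.standard_normal()
        samples[i] = x
    return samples

s = langevin_chain(0.0)
print(s.mean(), s.var())   # roughly 0 and 1 for the N(0, 1) target
```

With a finite stepsize the stationary distribution is slightly biased away from the target, which is one motivation for bounding the divergence as the paper does.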


Stochastic Back-propagation and Variational Inference in Deep Latent Gaussian Models
We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and…
Black Box Variational Inference
This paper presents a "black box" variational inference algorithm, one that can be quickly applied to many models with little additional derivation, based on a stochastic optimization of the variational objective where the noisy gradient is computed from Monte Carlo samples from the variational distribution.
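The noisy gradient the summary describes is the score-function estimator; a minimal sketch on a toy problem (Gaussian target and variational family are my assumptions, not from the paper): the ELBO gradient is estimated as E_q[∇_m log q(z; m) · (log p(z) − log q(z; m))] using only samples from q, with no gradients of the model required.

```python
import numpy as np

rng = np.random.default_rng(1)

# Score-function ("black box") ELBO gradient for a toy problem:
# target p(z) = N(3, 1), variational family q(z; m) = N(m, 1).
def elbo_grad(m, n=1000):
    z = m + rng.standard_normal(n)    # samples from q(z; m)
    score = z - m                     # grad_m log N(z; m, 1)
    log_p = -0.5 * (z - 3.0) ** 2     # log p up to a constant
    log_q = -0.5 * (z - m) ** 2       # log q up to a constant
    return np.mean(score * (log_p - log_q))

m = 0.0
for _ in range(300):
    m += 0.1 * elbo_grad(m)           # stochastic gradient ascent
print(m)                              # approaches the target mean 3.0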
Deep Generative Stochastic Networks Trainable by Backprop
Theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders are provided, yielding along the way an interesting justification for dependency networks and generalized pseudolikelihood.
Variational Bayesian Inference with Stochastic Search
This work presents an alternative algorithm based on stochastic optimization that allows for direct optimization of the variational lower bound and demonstrates the approach on two non-conjugate models: logistic regression and an approximation to the HDP.
Stochastic variational inference
Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight.
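The per-coordinate adaptive stepsize described there is the AdaGrad update; a minimal sketch on a toy quadratic (the objective and learning rate are illustrative assumptions): each coordinate's effective stepsize is scaled by the inverse square root of its accumulated squared gradients.

```python
import numpy as np

# AdaGrad-style update: each coordinate gets its own effective stepsize,
# shrinking as squared gradients accumulate for that coordinate.
def adagrad_minimize(grad_fn, x0, lr=1.0, eps=1e-8, steps=200):
    x = np.asarray(x0, dtype=float).copy()
    g_sq_sum = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        g_sq_sum += g ** 2                        # per-coordinate accumulator
        x -= lr * g / (np.sqrt(g_sq_sum) + eps)   # adaptive step
    return x

# Minimize f(x) = x1^2 + 10*x2^2; gradient = [2*x1, 20*x2]. The badly
# scaled second coordinate gets a proportionally smaller stepsize.
x_opt = adagrad_minimize(lambda x: np.array([2 * x[0], 20 * x[1]]), [5.0, 5.0])
print(x_opt)   # both coordinates near 0
```

The appeal in the stochastic VI setting is exactly what the summary states: the per-coordinate scaling removes most of the manual learning-rate tuning.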
Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression
A general algorithm for approximating nonstandard Bayesian posterior distributions that minimizes the Kullback-Leibler divergence of an approximating distribution to the intractable posterior distribution.
Deep AutoRegressive Networks
An efficient approximate parameter estimation method based on the minimum description length (MDL) principle is derived, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference.
Efficient Learning of Deep Boltzmann Machines
We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM's), a generative model with many layers of hidden variables. The algorithm learns a separate "recognition" model that…
Representation Learning: A Review and New Perspectives
Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.