# Stochastic Gradient VB and the Variational Auto-Encoder

@inproceedings{Kingma2013StochasticGV, title={Stochastic Gradient VB and the Variational Auto-Encoder}, author={Diederik P. Kingma and M. Welling}, year={2013} }

How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contribution is two-fold. First, we show that a reparameterization of the variational lower bound yields a…
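The reparameterization the abstract refers to can be illustrated with a small NumPy sketch for a diagonal-Gaussian approximate posterior; the function names and setup here are our own illustration, not the paper's implementation:

```python
import numpy as np

# Sketch of the reparameterization for a diagonal Gaussian
# q(z|x) = N(mu, diag(exp(log_var))). Illustrative only.
def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, I): the sample is a
    # deterministic, differentiable function of (mu, log_var), so a
    # Monte Carlo estimate of the lower bound can be differentiated
    # with respect to the variational parameters.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Analytic KL(q(z|x) || N(0, I)) term of the variational lower bound.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

rng = np.random.default_rng(0)
z = reparameterize(np.zeros(3), np.zeros(3), rng)
print(z.shape)                                          # (3,)
print(kl_to_standard_normal(np.zeros(3), np.zeros(3)))  # 0.0 at the prior
```

Because the noise is exogenous, gradients of the sampled objective flow through `mu` and `log_var`, which is what distinguishes this estimator from score-function approaches.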

#### 188 Citations

Advances in Variational Inference

- Computer Science, Mathematics
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2019

An overview of recent trends in variational inference is given and a summary of promising future research directions is provided.

Variational Noise-Contrastive Estimation

- Computer Science, Mathematics
- AISTATS
- 2019

It is proved that VNCE can be used for both parameter estimation of unnormalised models and posterior inference of latent variables, and has the same level of generality as standard VI, meaning that advances made there can be directly imported to the unnormalised setting.

ALMOND: Adaptive Latent Modeling and Optimization via Neural Networks and Langevin Diffusion

- Computer Science
- 2020

Latent variable models cover a broad range of statistical and machine learning models, such as Bayesian models, linear mixed models, and Gaussian mixture models, which help improve the quality of existing methods.

Iterative Amortized Inference

- Computer Science, Mathematics
- ICML
- 2018

This work proposes iterative inference models, which learn to perform inference optimization through repeatedly encoding gradients, and demonstrates the inference optimization capabilities of these models and shows that they outperform standard inference models on several benchmark data sets of images and text.

Non-conjugate Posterior using Stochastic Gradient Ascent with Adaptive Stepsize

- Computer Science
- 2020 International Joint Conference on Neural Networks (IJCNN)
- 2020

A novel approach based on the recently proposed constant-stepsize stochastic gradient ascent allows large-scale learning on non-conjugate posteriors; inspired by SVI and Adam, adaptive stepsizes are used to significantly improve learning.

Iterative Inference Models

- 2017

Inference models, which replace an optimization-based inference procedure with a learned model, have been fundamental in advancing Bayesian deep learning, the most notable example being variational…

Variational posterior approximation using stochastic gradient ascent with adaptive stepsize

- Computer Science
- Pattern Recognit.
- 2021

This work explores stochastic gradient ascent as a fast algorithm for posterior approximation of the Dirichlet process mixture: the stepsize is first optimized using the momentum method, then Fisher information is introduced to allow an adaptive stepsize in the posterior approximation.

Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables

- Computer Science, Mathematics
- ICML
- 2020

A new variant of the NP model, called the Doubly Stochastic Variational Neural Process (DSVNP), combines the global latent variable and local latent variables for prediction and demonstrates competitive prediction performance in multi-output regression and uncertainty estimation in classification.

Automatic Relevance Determination For Deep Generative Models

- Mathematics
- 2015

A recurring problem when building probabilistic latent variable models is regularization and model selection, for instance, the choice of the dimensionality of the latent space. In the context of…

A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI

- Mathematics, Computer Science
- ICML
- 2017

A distribution over variational parameters is derived, designed to minimize a bound on the divergence between the resulting marginal distribution and the target, and an example of how to sample from this distribution in a way that interpolates between the behavior of existing methods based on Langevin dynamics and stochastic gradient variational inference (SGVI).
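As a reference point for the Langevin end of the interpolation described in that summary, here is a minimal unadjusted Langevin dynamics sampler targeting a standard normal; this toy sketch is ours, not the paper's hybrid scheme:

```python
import numpy as np

# Unadjusted Langevin dynamics targeting a standard normal, whose
# score function is grad log p(z) = -z. Toy sketch only.
def langevin_samples(grad_log_p, z0, step, n, rng):
    z = z0
    out = np.empty(n)
    for i in range(n):
        # z' = z + step * grad log p(z) + sqrt(2 * step) * noise
        z = z + step * grad_log_p(z) + np.sqrt(2.0 * step) * rng.standard_normal()
        out[i] = z
    return out

rng = np.random.default_rng(2)
samples = langevin_samples(lambda z: -z, z0=0.0, step=0.1, n=50_000, rng=rng)
print(f"mean={samples.mean():.2f}, var={samples.var():.2f}")  # close to 0 and 1
```

With a finite stepsize the chain's stationary distribution is slightly biased (here the stationary variance is 1/(1 - step/2) rather than 1), which is one reason hybrids with variational methods are attractive.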

#### References

Showing 1-10 of 18 references

Stochastic Back-propagation and Variational Inference in Deep Latent Gaussian Models

- Mathematics, Computer Science
- ArXiv
- 2014

We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and…

Black Box Variational Inference

- Mathematics, Computer Science
- AISTATS
- 2014

This paper presents a "black box" variational inference algorithm, one that can be quickly applied to many models with little additional derivation, based on a stochastic optimization of the variational objective where the noisy gradient is computed from Monte Carlo samples from the variational distribution.
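The noisy gradient in that summary is typically the score-function estimator; here is a minimal one-dimensional Gaussian sketch of our own (not the paper's code), checked against a closed form:

```python
import numpy as np

# Score-function ("black box") gradient estimator:
#   grad_mu E_q[f(z)] = E_q[ f(z) * grad_mu log q(z | mu, sigma) ]
# For q = N(mu, sigma^2), grad_mu log q(z) = (z - mu) / sigma^2.
def score_function_grad(f, mu, sigma, n_samples, rng):
    z = mu + sigma * rng.standard_normal(n_samples)
    score_mu = (z - mu) / sigma**2
    return np.mean(f(z) * score_mu)

# Sanity check: for f(z) = z we have E_q[z] = mu, so the gradient
# with respect to mu is exactly 1.
rng = np.random.default_rng(1)
g = score_function_grad(lambda z: z, mu=0.5, sigma=1.0, n_samples=200_000, rng=rng)
print(g)  # a Monte Carlo estimate close to 1.0
```

The estimator only needs `log q` and samples from it, not gradients of `f`, which is what makes the approach "black box"; its higher variance is the usual price.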

Deep Generative Stochastic Networks Trainable by Backprop

- Mathematics, Computer Science
- ICML
- 2014

Theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders are provided, obtaining along the way an interesting justification for dependency networks and generalized pseudolikelihood.

Variational Bayesian Inference with Stochastic Search

- Computer Science, Mathematics
- ICML
- 2012

This work presents an alternative algorithm based on stochastic optimization that allows for direct optimization of the variational lower bound and demonstrates the approach on two non-conjugate models: logistic regression and an approximation to the HDP.

Stochastic variational inference

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2013

Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2011

This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that could be chosen in hindsight.
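The per-coordinate adaptive stepsize behind this reference (AdaGrad) can be sketched in a few lines; the quadratic toy problem below is our own illustration, not the paper's experiments:

```python
import numpy as np

# Minimal AdaGrad sketch: each coordinate gets its own stepsize,
# shrunk by the running sum of that coordinate's squared gradients.
def adagrad(grad, x0, lr=1.0, eps=1e-8, steps=500):
    x = np.asarray(x0, dtype=float)
    g_sq = np.zeros_like(x)  # running sum of squared gradients
    for _ in range(steps):
        g = grad(x)
        g_sq += g**2
        x = x - lr * g / (np.sqrt(g_sq) + eps)  # per-coordinate stepsize
    return x

# Minimize f(x) = 0.5 * ||x||^2, whose gradient is simply x.
x_final = adagrad(lambda x: x, x0=[5.0, -3.0])
print(np.abs(x_final).max() < 1e-3)  # True
```

Coordinates with historically large gradients get smaller steps, which is why a single global learning rate suffices in practice.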

Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression

- Mathematics, Computer Science
- ArXiv
- 2012

A general algorithm for approximating nonstandard Bayesian posterior distributions that minimizes the Kullback-Leibler divergence of an approximating distribution to the intractable posterior distribution.

Deep AutoRegressive Networks

- Computer Science, Mathematics
- ICML
- 2014

An efficient approximate parameter estimation method based on the minimum description length (MDL) principle is derived, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference.

Efficient Learning of Deep Boltzmann Machines

- Mathematics, Computer Science
- AISTATS
- 2010

We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM's), a generative model with many layers of hidden variables. The algorithm learns a separate "recognition" model that…

Representation Learning: A Review and New Perspectives

- Computer Science, Mathematics
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2013

Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.