• Corpus ID: 244130440

Natural Gradient Variational Inference with Gaussian Mixture Models

@article{Mahdisoltani2021NaturalGV,
  title={Natural Gradient Variational Inference with Gaussian Mixture Models},
  author={Farzaneh Mahdisoltani},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.08002}
}
Bayesian methods estimate a measure of uncertainty by using the posterior distribution p(z|D) = p(D|z)p(z)/p(D). One source of difficulty in these methods is the computation of the normalizing constant p(D) = ∫ p(D|z)p(z)dz. Calculating the exact posterior is generally intractable, so it is usually approximated. Variational Inference (VI) methods approximate the posterior with a distribution q(z), usually chosen from a simple family, by solving an optimization problem. The main contribution of this work is described…
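
As a concrete illustration of the objective VI optimizes, the sketch below estimates the evidence lower bound (ELBO), E_q[log p(D, z) − log q(z)], by Monte Carlo for a mixture-of-Gaussians q on a toy one-dimensional log-joint. The target, the mixture parameters, and names such as log_joint and elbo_estimate are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_joint(z):
    # Toy unnormalized log posterior log p(D, z) for a 1-D latent z
    # (illustrative; not the model from the paper).
    return -0.5 * (z - 2.0) ** 2 - 0.5 * z ** 2

def log_q(z, weights, means, stds):
    # Log density of a 1-D Gaussian mixture q(z) at each point of the array z.
    comps = (-0.5 * ((z[:, None] - means) / stds) ** 2
             - np.log(stds) - 0.5 * np.log(2.0 * np.pi))
    return np.log(np.sum(weights * np.exp(comps), axis=1))

def sample_q(n, weights, means, stds):
    # Ancestral sampling: pick a component, then draw from its Gaussian.
    ks = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[ks], stds[ks])

def elbo_estimate(n, weights, means, stds):
    # Monte Carlo estimate of the ELBO, E_q[log p(D, z) - log q(z)].
    z = sample_q(n, weights, means, stds)
    return np.mean(log_joint(z) - log_q(z, weights, means, stds))

weights = np.array([0.5, 0.5])
means = np.array([0.0, 1.5])
stds = np.array([1.0, 0.5])
print(elbo_estimate(10_000, weights, means, stds))
```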

References

Black Box Variational Inference

This paper presents a "black box" variational inference algorithm that can be quickly applied to many models with little additional derivation, based on stochastic optimization of the variational objective, where the noisy gradient is computed from Monte Carlo samples drawn from the variational distribution.
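
A minimal sketch of the score-function ("black box") gradient estimator this approach is built on, assuming a single Gaussian q(z) with parameters (mean, log standard deviation) and the same toy log-joint as above; all names, sample counts, and step sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_joint(z):
    # Same toy unnormalized log posterior as above; its exact posterior is
    # Gaussian with mean 1 and variance 0.5.
    return -0.5 * (z - 2.0) ** 2 - 0.5 * z ** 2

def log_q(z, mu, log_sigma):
    sigma = np.exp(log_sigma)
    return -0.5 * ((z - mu) / sigma) ** 2 - log_sigma - 0.5 * np.log(2.0 * np.pi)

def grad_log_q(z, mu, log_sigma):
    # Score function: gradient of log q(z) with respect to (mu, log_sigma).
    sigma = np.exp(log_sigma)
    d_mu = (z - mu) / sigma ** 2
    d_log_sigma = ((z - mu) / sigma) ** 2 - 1.0
    return np.stack([d_mu, d_log_sigma], axis=1)

def bbvi_gradient(mu, log_sigma, n_samples=1000):
    # Noisy ELBO gradient: E_q[ grad log q(z) * (log p(D, z) - log q(z)) ].
    z = rng.normal(mu, np.exp(log_sigma), size=n_samples)
    f = log_joint(z) - log_q(z, mu, log_sigma)
    return np.mean(grad_log_q(z, mu, log_sigma) * f[:, None], axis=0)

mu, log_sigma = 0.0, 0.0
for _ in range(2000):
    g_mu, g_ls = bbvi_gradient(mu, log_sigma)
    mu += 0.01 * g_mu              # stochastic gradient ascent on the ELBO
    log_sigma += 0.01 * g_ls
print(mu, np.exp(log_sigma))       # should approach mean 1.0 and std ~0.71
```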

The Variational Gaussian Approximation Revisited

The relationship between the Laplace and the variational approximation is discussed, and it is shown that for models with Gaussian priors and factorizing likelihoods, the number of free variational parameters is only 2N.
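
A brief note on where that count comes from, paraphrasing the Opper–Archambeau result rather than quoting this page:

```latex
% Sketch of the result (paraphrase): with a Gaussian prior N(0, K) over the
% N-dimensional latent z and a likelihood that factorizes across its components,
% the optimal Gaussian approximation takes the form
q(z) = \mathcal{N}\!\big(z \,\big|\, m,\; (K^{-1} + \mathrm{diag}(\lambda))^{-1}\big),
\qquad m \in \mathbb{R}^{N},\ \lambda \in \mathbb{R}^{N},
% so 2N free parameters suffice instead of N + N(N+1)/2 for a general mean and covariance.
```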

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations

The empirical results demonstrate faster convergence of the natural-gradient method than of black-box gradient-based methods with reparameterization gradients, which expands the scope of natural gradients for Bayesian inference and makes them more widely applicable than before.
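
As a sketch of what a natural-gradient step looks like in the single-Gaussian case (the paper's mixture updates generalize this), the snippet below runs the standard Gaussian natural-gradient VI update, precision ← (1 − β)·precision − β·E_q[∂²_z log p] and mean ← mean + (β/precision)·E_q[∂_z log p], on a quadratic toy log-joint where those expectations are exact. The target and step size are assumptions for illustration, not the paper's experiments.

```python
import numpy as np

# Quadratic toy log-joint log p(D, z) = -0.5 (z - 2)^2 - 0.5 z^2; its exact
# posterior is N(1, 0.5), so the Gaussian expectations below are closed form.
def grad_log_joint(z):
    return -2.0 * z + 2.0

hess_log_joint = -2.0      # constant second derivative of the log-joint

mu, prec = 0.0, 1.0        # variational mean and precision (1 / variance)
beta = 0.1                 # natural-gradient step size (illustrative)

for _ in range(100):
    g = grad_log_joint(mu)  # E_q[d/dz log p] is exact here: the gradient is linear in z
    h = hess_log_joint      # E_q[d^2/dz^2 log p]
    # Gaussian natural-gradient VI update, written in precision form:
    #   prec <- (1 - beta) * prec - beta * E_q[hess]
    #   mu   <- mu + (beta / prec) * E_q[grad]
    prec = (1.0 - beta) * prec - beta * h
    mu = mu + (beta / prec) * g

print(mu, 1.0 / prec)       # approaches the exact posterior mean 1.0 and variance 0.5
```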

Practical Deep Learning with Bayesian Principles

This work enables practical deep learning while preserving the benefits of Bayesian principles, applying techniques such as batch normalisation, data augmentation, and distributed training to achieve performance similar to the Adam optimiser in about the same number of epochs.

Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam

This work proposes new natural-gradient algorithms that reduce the effort of Gaussian mean-field VI by perturbing the network weights during gradient evaluations; uncertainty estimates can then be obtained cheaply from the vector that adapts the learning rate.
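
The snippet below is a rough sketch of the weight-perturbation idea only, not the paper's exact Vadam/VOGN updates: weights are sampled around the variational mean with a scale derived from the Adam-style second-moment vector, so that the same adapted vector doubles as an uncertainty estimate. The toy loss, prior precision, and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy quadratic loss over a 2-D weight vector (illustrative only).
target = np.array([1.0, -2.0])
def grad_loss(w):
    return w - target

mu = np.zeros(2)            # variational mean of the weights
m = np.zeros(2)             # Adam-style first-moment estimate
s = np.ones(2)              # Adam-style second-moment estimate (precision proxy)
lam = 1.0                   # assumed prior precision, for illustration
lr, beta1, beta2 = 0.05, 0.9, 0.999

for t in range(1, 2001):
    sigma = 1.0 / np.sqrt(s + lam)        # weight std derived from the adapted vector
    w = mu + sigma * rng.normal(size=2)   # perturb the weights before the gradient pass
    g = grad_loss(w)
    m = beta1 * m + (1.0 - beta1) * g
    s = beta2 * s + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    s_hat = s / (1.0 - beta2 ** t)
    mu = mu - lr * m_hat / (np.sqrt(s_hat) + lam)   # Adam-like mean update (sketch)

print(mu)                      # close to the loss minimum [1, -2]
print(1.0 / np.sqrt(s + lam))  # per-weight uncertainty read off the adapted vector
```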

The Information Geometry of Mirror Descent

It is proved that mirror descent induced by Bregman divergence proximity functions is equivalent to the natural gradient descent algorithm on the dual Riemannian manifold.
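
For reference, the equivalence can be written out as follows; the potential ψ, its convex conjugate ψ*, and the step size η are generic notation, not taken from the paper.

```latex
% Mirror descent step generated by a strictly convex potential \psi:
\theta_{t+1} = \arg\min_{\theta}\;
    \langle \nabla L(\theta_t),\, \theta \rangle
    + \tfrac{1}{\eta}\, B_{\psi}(\theta, \theta_t),
\qquad
B_{\psi}(\theta, \theta') = \psi(\theta) - \psi(\theta')
    - \langle \nabla\psi(\theta'),\, \theta - \theta' \rangle .

% In the dual coordinates \mu = \nabla\psi(\theta) the same step reads
\mu_{t+1} = \mu_t - \eta\, \nabla_{\theta} L(\theta_t)
          = \mu_t - \eta\, \big(\nabla^2\psi^{*}(\mu_t)\big)^{-1} \nabla_{\mu} L\big(\theta(\mu_t)\big),

% i.e. natural gradient descent in the dual coordinates, with the Hessian of the
% conjugate potential \psi^{*} acting as the Riemannian metric.
```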

Natural Gradient Works Efficiently in Learning

  • S. Amari
  • Neural Computation, 1998
The dynamical behavior of natural-gradient online learning is analyzed and proved to be Fisher efficient, implying that it asymptotically has the same performance as the optimal batch estimation of parameters.
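
The update analyzed there is the natural-gradient rule below, with the Fisher information of the model p_θ as the metric (generic notation, not quoted from the paper):

```latex
% Natural-gradient update with the Fisher information metric:
\theta_{t+1} = \theta_t - \eta_t\, F(\theta_t)^{-1}\, \nabla_{\theta} L(\theta_t),
\qquad
F(\theta) = \mathbb{E}_{x \sim p_{\theta}}\!\left[
    \nabla_{\theta}\log p_{\theta}(x)\, \nabla_{\theta}\log p_{\theta}(x)^{\top}
\right].
```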
