Corpus ID: 235376939

Mixture weights optimisation for Alpha-Divergence Variational Inference

@inproceedings{Daudel2021MixtureWO,
  title={Mixture weights optimisation for Alpha-Divergence Variational Inference},
  author={Kam{\'e}lia Daudel and Randal Douc},
  booktitle={NeurIPS},
  year={2021}
}
This paper focuses on α-divergence minimisation methods for Variational Inference. We consider the case where the posterior density is approximated by a mixture model and we investigate algorithms optimising the mixture weights of this mixture model by α-divergence minimisation, without any information on the underlying distribution of its mixture components parameters. The Power Descent, defined for all α ≠ 1, is one such algorithm and we establish in our work the full proof of its…
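For intuition, here is a minimal, self-contained sketch of the kind of update studied in the paper: a multiplicative, power-style re-weighting of a fixed mixture's weights, driven by Monte Carlo estimates of an α-divergence-type statistic. This is an illustration under toy assumptions (Gaussian components, a two-mode 1-D target, α and step size chosen arbitrarily), not the paper's Power Descent as stated there; every name below is a placeholder.

```python
# Illustrative sketch only, not the paper's exact algorithm: optimise the
# weights of a fixed Gaussian mixture by a multiplicative, power-style update
# driven by Monte Carlo estimates of an alpha-divergence-type statistic.
# The toy target, component locations and every name here are placeholders.
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

rng = np.random.default_rng(0)

def log_p(x):
    """Log of an unnormalised two-mode toy target."""
    return np.logaddexp(norm.logpdf(x, -2.0, 0.7), norm.logpdf(x, 2.5, 1.0))

MEANS = np.array([-3.0, -1.0, 1.0, 3.0])   # fixed component parameters theta_k
SCALE = 1.0
ALPHA, ETA = 0.5, 0.3                      # divergence order and step size

def power_style_step(lam, n_samples=2000):
    """One multiplicative update of the mixture weights on the simplex."""
    ks = rng.choice(len(lam), size=n_samples, p=lam)
    x = rng.normal(MEANS[ks], SCALE)                          # x ~ q_lam
    log_k = norm.logpdf(x[:, None], MEANS[None, :], SCALE)    # log k(theta_j, x_i)
    log_q = logsumexp(log_k + np.log(lam), axis=1)            # log q_lam(x_i)
    # Per-component statistic B_j ~ E_{k(theta_j,.)}[(q_lam/p)^(alpha-1)],
    # estimated from the q_lam samples via importance weights k_j/q_lam.
    w = np.exp(log_k - log_q[:, None])
    g = np.exp((ALPHA - 1.0) * (log_q - log_p(x)))
    B = (w * g[:, None]).mean(axis=0)
    # Power-style multiplicative step, then renormalise back to the simplex.
    lam_new = lam * B ** (ETA / (1.0 - ALPHA))
    return lam_new / lam_new.sum()

lam = np.full(len(MEANS), 1.0 / len(MEANS))
for _ in range(50):
    lam = power_style_step(lam)
print(lam)   # mass shifts toward components that cover the target's modes
```

Components sitting where the target has little mass see their weights shrink, which is the qualitative behaviour such weight-optimisation schemes aim for; the paper's actual conditions and guarantees are, of course, in the text.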
1 Citation


Variational inference via Wasserstein gradient flows
TLDR
This work proposes principled methods for VI in which π̂ is taken to be a Gaussian or a mixture of Gaussians; these methods rest upon the theory of gradient flows on the Bures–Wasserstein space of Gaussian measures.

References

SHOWING 1-10 OF 43 REFERENCES
Infinite-dimensional gradient-based descent for alpha-divergence minimisation
TLDR
The $(\alpha, \Gamma)$-descent is introduced, an iterative algorithm which operates on measures and performs $\alpha$-divergence minimisation in a Bayesian framework, and it is proved that for a rich family of functions $\Gamma$, this algorithm leads at each step to a systematic decrease in the $\alpha$-divergence.
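For orientation, a rough schematic of this update family, paraphrased here rather than quoted from the reference (the exact definitions, shift $\kappa$, and assumptions are those of the cited paper):

```latex
% Schematic only; notation paraphrased, not quoted from the reference.
\[
  \mu_{n+1}(\mathrm{d}\theta) \;\propto\; \mu_n(\mathrm{d}\theta)\,
    \Gamma\!\bigl(b_{\mu_n,\alpha}(\theta) + \kappa\bigr),
  \qquad
  b_{\mu,\alpha}(\theta) = \int k(\theta,y)\,
    f_\alpha'\!\Bigl(\tfrac{\mu k(y)}{p(y)}\Bigr)\,\nu(\mathrm{d}y),
\]
\[
  \text{the Power Descent corresponding to the choice }
  \Gamma(v) = \bigl[(\alpha-1)v + 1\bigr]^{\eta/(1-\alpha)}.
\]
```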
Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence
TLDR
A novel combination of optimization and sampling techniques for approximate Bayesian inference is proposed by constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence, which guarantees asymptotic consistency and fast convergence towards both the optimal IS estimator and the optimal variational approximation.
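The mechanism behind that construction can be sketched in a few lines: the gradient of the forward KL, ∇_φ KL(p‖q_φ) = −E_p[∇_φ log q_φ(z)], can be estimated with self-normalised importance sampling using q_φ itself as the proposal. The toy target, the Gaussian family and all names below are assumptions made for illustration, not the paper's exact estimator.

```python
# Minimal sketch (toy assumptions, not the paper's exact estimator): fit a
# Gaussian q_phi to an unnormalised target by descending the forward KL,
# whose gradient -E_p[grad_phi log q_phi] is estimated with self-normalised
# importance sampling that uses q_phi itself as the proposal.
import numpy as np

rng = np.random.default_rng(1)

def log_p(z):                        # unnormalised 1-D toy target
    return -0.5 * ((z - 3.0) / 1.5) ** 2

mu, log_sigma = 0.0, 0.0             # variational parameters phi = (mu, log_sigma)
lr, n = 0.05, 1000

for _ in range(500):
    sigma = np.exp(log_sigma)
    z = rng.normal(mu, sigma, size=n)                     # z ~ q_phi
    log_q = -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma)
    logw = log_p(z) - log_q
    w = np.exp(logw - logw.max())
    w /= w.sum()                                          # self-normalised weights ~ p/q
    g_mu = (z - mu) / sigma ** 2                          # grad_mu log q_phi(z)
    g_logsig = ((z - mu) ** 2) / sigma ** 2 - 1.0         # grad_logsigma log q_phi(z)
    mu += lr * np.sum(w * g_mu)                           # ascent on E_p[log q_phi]
    log_sigma += lr * np.sum(w * g_logsig)                # == descent on KL(p || q_phi)

print(mu, np.exp(log_sigma))   # should approach the target's mean/scale (3.0, 1.5)
```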
Variational Inference with Tail-adaptive f-Divergence
TLDR
A new class of tail-adaptive f-divergences is proposed that adaptively changes the convex function f with the tail of the importance weights, in a way that theoretically guarantees finite moments while simultaneously achieving mass-covering properties.
Perturbative Black Box Variational Inference
TLDR
This paper views BBVI with generalized divergences as a form of estimating the marginal likelihood via biased importance sampling, and builds a family of new variational bounds that captures the standard KL bound for $K=1$, and converges to the exact marginal likelihood as $K\to\infty$.
Consistency of variational Bayes inference for estimation and model selection in mixtures
TLDR
This work studies the concentration of variational approximations of posteriors, and proves that the approach already used in practice, which consists in maximizing a numerical criterion (the Evidence Lower Bound), leads to strong oracle inequalities.
Monotonic Alpha-divergence Minimisation
TLDR
This paper introduces a novel iterative algorithm which carries out α-divergence minimisation by ensuring a systematic decrease in the α-divergence at each step, and sheds new light on an integrated Expectation Maximization algorithm.
Meta-Learning Divergences for Variational Inference
TLDR
This paper proposes a meta-learning algorithm to learn the divergence metric suited to the task of interest, automating the design of VI methods, and demonstrates that this approach outperforms standard VI on Gaussian mixture distribution approximation, Bayesian neural network regression, image generation with variational autoencoders, and recommender systems with a partial variational autoencoder.
Rényi Divergence Variational Inference
TLDR
The variational Rényi bound (VR) is introduced, extending traditional variational inference to Rényi's alpha-divergences, and a novel variational inference method is proposed as a new special case in the proposed framework.
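For reference, the VR bound has the closed form L_α = (1/(1−α)) log E_q[(p(x,z)/q(z))^(1−α)] for α ≠ 1, recovering the usual ELBO in the limit α → 1. A minimal Monte Carlo sketch, where log_joint, log_q and sample_q are placeholders standing in for user code:

```python
# Monte Carlo estimate of the variational Renyi (VR) bound for alpha != 1:
# L_alpha = 1/(1 - alpha) * log E_q[(p(x, z) / q(z))^(1 - alpha)].
# log_joint, log_q and sample_q are placeholders standing in for user code.
import numpy as np
from scipy.special import logsumexp

def vr_bound(log_joint, log_q, sample_q, alpha, n_samples=1000):
    z = sample_q(n_samples)                    # z_k ~ q
    logw = log_joint(z) - log_q(z)             # log p(x, z_k) - log q(z_k)
    return (logsumexp((1.0 - alpha) * logw) - np.log(n_samples)) / (1.0 - alpha)

# Toy usage: q = N(0, 1), joint taken as N(z; 1, 1); shared constants cancel.
rng = np.random.default_rng(0)
estimate = vr_bound(
    log_joint=lambda z: -0.5 * (z - 1.0) ** 2,
    log_q=lambda z: -0.5 * z ** 2,
    sample_q=lambda n: rng.normal(size=n),
    alpha=0.5,
    n_samples=10_000,
)
print(estimate)
```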
Empirical Evaluation of Biased Methods for Alpha Divergence Minimization
TLDR
Biased methods for alpha-divergence minimization are empirically evaluated, and it is shown that weight degeneracy does indeed occur with these estimators in cases where they return highly biased solutions; these results are related to the curse of dimensionality.
On the Difficulty of Unbiased Alpha Divergence Minimization
TLDR
By studying the signal-to-noise ratio (SNR) of the gradient estimator, it is found that when alpha is not zero the SNR worsens exponentially in the dimensionality of the problem, casting doubt on the practicality of unbiased methods for alpha-divergence minimization.
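The diagnostic itself is easy to reproduce in a generic form: draw the same stochastic gradient estimator many times and report |mean|/std per coordinate. The sketch below is generic (grad_estimator is a placeholder callable), not code from that paper.

```python
# Generic signal-to-noise-ratio diagnostic for any stochastic gradient
# estimator (e.g. a reparameterised alpha-divergence gradient); the
# grad_estimator callable is a placeholder, nothing here is paper-specific.
import numpy as np

def empirical_snr(grad_estimator, n_repeats=1000):
    grads = np.stack([grad_estimator() for _ in range(n_repeats)])  # (R, D)
    return np.abs(grads.mean(axis=0)) / (grads.std(axis=0) + 1e-12)

# Toy usage: a noisy 3-D gradient whose true value is (1, 1, 1).
rng = np.random.default_rng(0)
print(empirical_snr(lambda: 1.0 + rng.normal(scale=5.0, size=3)))
```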
...