# Infinite-dimensional gradient-based descent for alpha-divergence minimisation

@article{Daudel2021InfinitedimensionalGD,
title={Infinite-dimensional gradient-based descent for alpha-divergence minimisation},
author={Kamélia Daudel and Randal Douc and François Portier},
journal={The Annals of Statistics},
year={2021}
}
• Published 20 May 2020
• Computer Science
• The Annals of Statistics
This paper introduces the $(\alpha, \Gamma)$-descent, an iterative algorithm which operates on measures and performs $\alpha$-divergence minimisation in a Bayesian framework. This gradient-based procedure extends the commonly-used variational approximation by adding a prior on the variational parameters in the form of a measure. We prove that for a rich family of functions $\Gamma$, this algorithm leads at each step to a systematic decrease in the $\alpha$-divergence. Our framework recovers the…
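The update described in the abstract — an iterative, multiplicative reweighting of a measure over variational parameters driven by an α-divergence gradient term — can be illustrated numerically. The following is a minimal sketch only, assuming a discrete measure over a fixed grid of 1-D Gaussian kernel means, a toy bimodal target, the power-descent choice of Γ, and plain importance sampling to approximate the inner integral; all densities, grids, and constants here are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete measure mu_n: weights over a fixed grid of variational
# parameters theta_j (means of unit-variance Gaussian kernels k(theta, .)).
thetas = np.linspace(-4.0, 4.0, 20)
weights = np.full(len(thetas), 1.0 / len(thetas))

def kernel(theta, y):
    # Gaussian variational kernel k(theta, y), unit variance
    return np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2.0 * np.pi)

def target(y):
    # Illustrative unnormalised target p(y): two-component Gaussian mixture
    return 0.5 * np.exp(-0.5 * (y - 2.0) ** 2) + 0.5 * np.exp(-0.5 * (y + 2.0) ** 2)

alpha, eta = 0.5, 0.3                       # divergence order and step size
ys = rng.normal(0.0, 3.0, size=5000)        # importance samples from q0 = N(0, 9)
q0 = np.exp(-0.5 * (ys / 3.0) ** 2) / (3.0 * np.sqrt(2.0 * np.pi))

K = kernel(thetas[:, None], ys[None, :])    # K[j, i] = k(theta_j, y_i)

for _ in range(50):
    mu_k = weights @ K                      # mixture density (mu k)(y_i)
    ratio = mu_k / target(ys)
    # b(theta_j) ~ E[ k(theta_j, y) f'_alpha((mu k)(y)/p(y)) ] via importance
    # sampling, with f_alpha(u) = (u^alpha - 1)/(alpha(alpha - 1)), hence
    # f'_alpha(u) = u^(alpha-1)/(alpha - 1).
    b = np.mean(K * (ratio ** (alpha - 1) / (alpha - 1)) / q0, axis=1)
    # Power-descent choice: Gamma(v) = ((alpha-1) v + 1)^(eta/(1-alpha))
    gamma = ((alpha - 1) * b + 1.0) ** (eta / (1.0 - alpha))
    weights = weights * gamma               # multiplicative measure update
    weights /= weights.sum()                # renormalise to a probability measure
```

In this sketch the update is multiplicative and normalisation-preserving, so each iterate remains a valid probability measure over the grid of kernel parameters, which mirrors the "operates on measures" aspect emphasised in the abstract.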
6 Citations

## Citations

### Mixture weights optimisation for Alpha-Divergence Variational Inference

• Computer Science
NeurIPS
• 2021
The link between Power Descent and Entropic Mirror Descent is investigated, and first-order approximations allow the authors to introduce the Rényi Descent, a novel algorithm for which they prove an O(1/N) convergence rate.

### Monotonic Alpha-divergence Minimisation

ArXiv
• 2021
This paper introduces a novel iterative algorithm which carries out α-divergence minimisation by ensuring a systematic decrease in the α-divergence at each step, and sheds new light on an integrated Expectation Maximization algorithm.

### Monotonic Alpha-divergence Minimisation for Variational Inference

• Computer Science
• 2021
A novel family of iterative algorithms is introduced which carry out α-divergence minimisation in a Variational Inference context by ensuring a systematic decrease at each step in the α-divergence between the variational and the posterior distributions.

• Mathematics
AISTATS
• 2022
Adaptive importance sampling is a widespread Monte Carlo technique that uses a re-weighting strategy to iteratively estimate the so-called target distribution. A major drawback of adaptive

### Variational inference via Wasserstein gradient flows

• Computer Science
ArXiv
• 2022
This work proposes principled methods for VI in which π̂ is taken to be a Gaussian or a mixture of Gaussians; these methods rest upon the theory of gradient flows on the Bures–Wasserstein space of Gaussian measures.

### A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations

• Computer Science
ACL
• 2021
A novel variational upper bound to the mutual information between an attribute and the latent code of an encoder is introduced, leading to both better disentangled representations and, in particular, more precise control of the desired degree of disentanglement than state-of-the-art methods proposed for textual data.

## References

SHOWING 1-10 OF 58 REFERENCES

### Safe adaptive importance sampling: A mixture approach

• Mathematics
The Annals of Statistics
• 2020
This paper investigates adaptive importance sampling algorithms for which the policy, the sequence of distributions used to generate the particles, is a mixture distribution between a flexible kernel

### The $f$-Divergence Expectation Iteration Scheme.

• Computer Science
• 2019
Empirical results support the claim that this novel iterative algorithm, which operates on measures and performs $f$-divergence minimisation in a Bayesian framework, serves as a powerful tool to assist variational methods.

### Efficiency versus robustness : the case for minimum Hellinger distance and related methods

It is shown how and why the influence curve poorly measures the robustness properties of minimum Hellinger distance estimation. Rather, for this and related forms of estimation, there is another

### Markov Processes and the H-Theorem

The H-theorem is investigated in view of Markov processes. The proof is valid even in fields other than physics, since none of the physical relations, such as the principle of microscopic

### Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen.

The theory of integral equations and the more recent investigations that followed have from the outset been guided by the effort to carry over the theorems of algebra on systems of linear equations and

### Adaptive importance sampling in Monte Carlo integration

• 1992
An Adaptive Importance Sampling (AIS) scheme is introduced to compute integrals of the form as a mechanical, yet flexible, way of dealing with the selection of parameters of the importance function.
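The AIS scheme summarised above — draw from a parametric importance function, reweight against the target, and adapt the parameters from the weighted sample — can be sketched in a few lines. This is a minimal illustration only, assuming a 1-D Gaussian proposal adapted by weighted moment matching against a toy unnormalised Gaussian target; the target, sample sizes, and adaptation rule are illustrative assumptions, not the specific scheme of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def p(x):
    # Illustrative unnormalised target density: N(3, 1) up to a constant
    return np.exp(-0.5 * (x - 3.0) ** 2)

mu, sigma = 0.0, 2.0                     # initial proposal parameters
for _ in range(10):
    xs = rng.normal(mu, sigma, size=2000)
    q = np.exp(-0.5 * ((xs - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    w = p(xs) / q                        # importance weights
    w /= w.sum()                         # self-normalise
    # Adapt the importance function by moment matching on the weighted sample
    mu = np.sum(w * xs)
    sigma = np.sqrt(np.sum(w * (xs - mu) ** 2)) + 1e-12

# Self-normalised estimate of E_p[X] under the final proposal
estimate = mu
```

Each round concentrates the proposal on the region where the target has mass, which is the "mechanical, yet flexible" parameter-selection idea the snippet describes.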

### Information geometric measurements of generalisation

• Computer Science
• 1995
The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd, which offers conceptual simplification to information geometry.

### A Generalization Bound for Online Variational Inference

• Computer Science
ACML
• 2019
It is shown that this is indeed the case for some variational inference (VI) algorithms, and theoretical justifications in favor of online algorithms relying on approximate Bayesian methods are presented.

### Safe and adaptive importance sampling: a mixture approach

• Mathematics
• 2019
This paper investigates adaptive importance sampling algorithms for which the policy, the sequence of distributions used to generate the particles, is a mixture distribution between a flexible kernel

### Bayesian estimates of equation system parameters, An application of integration by Monte Carlo

• Economics
• 1976
Monte Carlo (MC) is used to draw parameter values from a distribution defined on the structural parameter space of an equation system. Making use of the prior density, the likelihood, and