# Infinite-dimensional gradient-based descent for alpha-divergence minimisation

@article{Daudel2021InfinitedimensionalGD, title={Infinite-dimensional gradient-based descent for alpha-divergence minimisation}, author={Kamélia Daudel and Randal Douc and François Portier}, journal={The Annals of Statistics}, year={2021} }

This paper introduces the $(\alpha, \Gamma)$-descent, an iterative algorithm which operates on measures and performs $\alpha$-divergence minimisation in a Bayesian framework. This gradient-based procedure extends the commonly-used variational approximation by adding a prior on the variational parameters in the form of a measure. We prove that for a rich family of functions $\Gamma$, this algorithm leads at each step to a systematic decrease in the $\alpha$-divergence. Our framework recovers the…
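In the mixture-weights special case, one step of this descent multiplies each weight by $\Gamma$ evaluated at a first-order term and renormalises. The sketch below illustrates this for the power-function choice of $\Gamma$ (the Power Descent case); the per-component term `b`, the step size `eta`, and the shift `kappa` are placeholders here, since computing $b_{\mu,\alpha}(\theta_k)$ from samples is problem-specific and omitted:

```python
import numpy as np

def power_descent_step(weights, b, alpha=0.5, eta=0.5, kappa=0.0):
    """One (alpha, Gamma)-descent update on mixture weights (illustrative sketch).

    With Gamma(v) = ((alpha - 1) * v + 1) ** (eta / (1 - alpha)), each weight
    is multiplied by Gamma(b_k + kappa) and the result is renormalised.
    `b` stands in for the first-order term b_{mu, alpha}(theta_k); its
    estimation is not shown here.
    """
    gamma = ((alpha - 1.0) * (b + kappa) + 1.0) ** (eta / (1.0 - alpha))
    new_weights = weights * gamma
    return new_weights / new_weights.sum()

# Toy usage: components with a smaller first-order term gain weight.
w = np.array([0.25, 0.25, 0.25, 0.25])
b = np.array([0.9, 0.4, 0.1, 0.6])
w_next = power_descent_step(w, b)
```

The update stays on the probability simplex by construction, which is what makes the monotone decrease of the α-divergence possible at every step.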

## 6 Citations

### Mixture weights optimisation for Alpha-Divergence Variational Inference

- Computer Science, NeurIPS
- 2021

The link between Power Descent and Entropic Mirror Descent is investigated, and first-order approximations allow the authors to introduce the Rényi Descent, a novel algorithm for which they prove an $O(1/N)$ convergence rate.

### Monotonic Alpha-divergence Minimisation

- Computer Science, Business, ArXiv
- 2021

This paper introduces a novel iterative algorithm which carries out α-divergence minimisation by ensuring a systematic decrease in the α-divergence at each step, and sheds new light on an integrated Expectation Maximization algorithm.

### Monotonic Alpha-divergence Minimisation for Variational Inference

- Computer Science
- 2021

A novel family of iterative algorithms is introduced which carry out α-divergence minimisation in a Variational Inference context by ensuring, at each step, a systematic decrease in the α-divergence between the variational and the posterior distributions.

### Adaptive Importance Sampling meets Mirror Descent: a Bias-Variance Tradeoff

- Mathematics, AISTATS
- 2022

Adaptive importance sampling is a widely spread Monte Carlo technique that uses a re-weighting strategy to iteratively estimate the so-called target distribution. A major draw-back of adaptive…
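The re-weighting strategy these papers build on can be sketched in a few lines: draw particles from the current proposal, compute self-normalised importance weights against the (unnormalised) target, and moment-match the proposal to the weighted particles. Everything below — the Gaussian proposal family, the moment-matching adaptation, and the function names — is an illustrative assumption, not the scheme of any single cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def ais_estimate(log_target, phi, n_iters=5, n_samples=2000):
    """Self-normalised adaptive importance sampling (illustrative sketch).

    `log_target` is the unnormalised log-density of the target and `phi`
    the test function whose expectation we estimate. The Gaussian proposal
    is re-fitted to the weighted particles at each iteration.
    """
    mean, std = 0.0, 3.0  # deliberately wide initial proposal
    for _ in range(n_iters):
        x = rng.normal(mean, std, size=n_samples)
        log_q = -0.5 * ((x - mean) / std) ** 2 - np.log(std)
        log_w = log_target(x) - log_q
        w = np.exp(log_w - log_w.max())  # stabilise before normalising
        w /= w.sum()
        # Re-weighting step: moment-match the proposal to the particles.
        mean = np.sum(w * x)
        std = np.sqrt(np.sum(w * (x - mean) ** 2)) + 1e-12
    return np.sum(w * phi(x))

# Target: N(2, 1) up to a constant; estimate E[X], which is 2.
est = ais_estimate(lambda x: -0.5 * (x - 2.0) ** 2, lambda x: x)
```

The bias-variance trade-off discussed in the citation above arises precisely in how aggressively the proposal is adapted from the weighted particles at each iteration.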

### Variational inference via Wasserstein gradient flows

- Computer Science, ArXiv
- 2022

This work proposes principled methods for VI, in which π̂ is taken to be a Gaussian or a mixture of Gaussians, which rest upon the theory of gradient flows on the Bures–Wasserstein space of Gaussian measures.

### A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations

- Computer Science, ACL
- 2021

A novel variational upper bound on the mutual information between an attribute and the latent code of an encoder is introduced, leading to better disentangled representations and, in particular, more precise control of the desired degree of disentanglement than state-of-the-art methods proposed for textual data.

## References

Showing 1–10 of 58 references.

### Safe adaptive importance sampling: A mixture approach

- Mathematics, The Annals of Statistics
- 2020

This paper investigates adaptive importance sampling algorithms for which the policy, the sequence of distributions used to generate the particles, is a mixture distribution between a flexible kernel…

### The $f$-Divergence Expectation Iteration Scheme.

- Computer Science
- 2019

Empirical results support the claim that this novel iterative algorithm, which operates on measures and performs $f$-divergence minimisation in a Bayesian framework, serves as a powerful tool to assist Variational methods.

### Efficiency versus robustness: the case for minimum Hellinger distance and related methods

- Mathematics
- 1994

It is shown how and why the influence curve poorly measures the robustness properties of minimum Hellinger distance estimation. Rather, for this and related forms of estimation, there is another…

### Markov Processes and the H-Theorem

- Mathematics
- 1963

The H-theorem is investigated in view of Markov processes. The proof is valid even in fields other than physics, since none of the physical relations, such as the principle of microscopic…

### Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen.

- Mathematics
- 1909

The theory of integral equations and the subsequent more recent investigations have from the outset been driven by the effort to carry over the theorems of algebra on linear systems of equations and…

### Adaptive importance sampling in Monte Carlo integration

- Business
- 1992

An Adaptive Importance Sampling (AIS) scheme is introduced to compute integrals, offering a mechanical yet flexible way of dealing with the selection of the parameters of the importance function.…

### Information geometric measurements of generalisation

- Computer Science
- 1995

The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces $L_d$ and $L_{d^*}$, which offers conceptual simplification to information geometry.

### A Generalization Bound for Online Variational Inference

- Computer Science, ACML
- 2019

It is shown that this is indeed the case for some variational inference (VI) algorithms, and theoretical justifications in favor of online algorithms relying on approximate Bayesian methods are presented.

### Safe and adaptive importance sampling: a mixture approach

- Mathematics
- 2019

This paper investigates adaptive importance sampling algorithms for which the policy, the sequence of distributions used to generate the particles, is a mixture distribution between a flexible kernel…

### Bayesian estimates of equation system parameters: An application of integration by Monte Carlo

- Economics
- 1976

Monte Carlo (MC) is used to draw parameter values from a distribution defined on the structural parameter space of an equation system. Making use of the prior density, the likelihood, and…