Likelihood Ratio Exponential Families
@article{Brekelmans2020LikelihoodRE,
  title={Likelihood Ratio Exponential Families},
  author={Rob Brekelmans and Frank Nielsen and Alireza Makhzani and A. G. Galstyan and Greg Ver Steeg},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.15480}
}
The exponential family is well known in machine learning and statistical physics as the maximum entropy distribution subject to a set of observed constraints [1], while the geometric mixture path is common in MCMC methods such as annealed importance sampling (AIS) [2, 3]. Linking these two ideas, recent work [4] has interpreted the geometric mixture path as an exponential family of distributions to analyse the thermodynamic variational objective (TVO) [5]. In this work, we extend likelihood…
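The link described in the abstract is the identity that the geometric mixture path pi_beta ∝ pi0^(1-beta) * pi1^beta is a one-parameter exponential family with base measure pi0 and sufficient statistic T(x) = log pi1(x)/pi0(x). A minimal numerical sketch of this identity, assuming two illustrative 1-D Gaussians not taken from the paper:

```python
import numpy as np

# Unnormalized log-densities of two 1-D Gaussians (illustrative endpoints).
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
log_pi0 = -0.5 * x**2                  # base:   N(0, 1), up to a constant
log_pi1 = -0.5 * (x - 3.0)**2 / 4.0    # target: N(3, 4), up to a constant

def normalize(log_p):
    """Normalize an unnormalized log-density on the grid."""
    p = np.exp(log_p - log_p.max())
    return p / (p.sum() * dx)

beta = 0.4  # a point along the annealing path

# Geometric mixture path: pi_beta ∝ pi0^(1-beta) * pi1^beta.
geo = normalize((1 - beta) * log_pi0 + beta * log_pi1)

# Same path as an exponential family: base pi0 tilted by the
# likelihood-ratio sufficient statistic T(x) with natural parameter beta.
T = log_pi1 - log_pi0
expfam = normalize(log_pi0 + beta * T)

assert np.allclose(geo, expfam)
```

The two constructions agree pointwise on the grid, which is the starting observation the paper builds on.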
5 Citations
q-Paths: Generalizing the Geometric Annealing Path using Power Means
- Mathematics, UAI
- 2021
This work introduces q-paths, a family of paths which is derived from a generalized notion of the mean, includes the geometric and arithmetic mixtures as special cases, and admits a simple closed form involving the deformed logarithm function from nonextensive thermodynamics.
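The power-mean construction behind q-paths can be sketched in a few lines. Here the endpoint densities and grid are illustrative choices; the path takes the power mean of order 1 - q of the endpoints, recovering the arithmetic mixture at q = 0 and the geometric mixture in the limit q → 1:

```python
import numpy as np

# Unnormalized endpoint densities (illustrative 1-D Gaussians).
x = np.linspace(-5.0, 5.0, 1001)
pi0 = np.exp(-0.5 * x**2)                # N(0, 1), up to a constant
pi1 = np.exp(-0.5 * (x - 3.0)**2 / 4.0)  # N(3, 4), up to a constant

def q_path(beta, q):
    """Unnormalized q-path: power mean of order 1 - q of pi0 and pi1."""
    if np.isclose(q, 1.0):  # q -> 1 limit is the geometric mixture
        return pi0**(1 - beta) * pi1**beta
    return ((1 - beta) * pi0**(1 - q) + beta * pi1**(1 - q))**(1 / (1 - q))

beta = 0.5
# q = 0 recovers the arithmetic mixture exactly.
assert np.allclose(q_path(beta, 0.0), (1 - beta) * pi0 + beta * pi1)
# q near 1 approaches the geometric mixture.
assert np.allclose(q_path(beta, 0.9999),
                   pi0**(1 - beta) * pi1**beta, rtol=1e-2)
```

The path also interpolates the endpoints: beta = 0 returns pi0 and beta = 1 returns pi1, for any q.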
Revisiting Chernoff Information with Likelihood Ratio Exponential Families
- Computer Science, Entropy
- 2022
This paper revisits the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: The so-called likelihood ratio exponential families.
Rho-Tau Bregman Information and the Geometry of Annealing Paths
- Mathematics, ArXiv
- 2022
Markov Chain Monte Carlo methods for sampling from complex distributions and estimating normalization constants often simulate samples from a sequence of intermediate distributions along an annealing…
On a Variational Definition for the Jensen-Shannon Symmetrization of Distances Based on the Information Radius
- Computer Science, Entropy
- 2021
We generalize the Jensen-Shannon divergence and the Jensen-Shannon diversity index by considering a variational definition with respect to a generic mean, thereby extending the notion of Sibson’s…
Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry
- Mathematics, Computer Science, ArXiv
- 2023
It is shown how quasi-arithmetic averages express points on dual geodesics and sided barycenters in the dual affine coordinate systems, and several parametric and non-parametric statistical models are described which are closed under the quasi-arithmetic mixture operation.
References
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference
- Computer Science, ICML
- 2020
An exponential family interpretation of the geometric mixture curve underlying the TVO and various path sampling methods is proposed, which allows expressing the gap in TVO likelihood bounds as a sum of KL divergences, and a doubly reparameterized gradient estimator is derived which improves model learning and allows the TVO to benefit from more refined bounds.
Graphical Models, Exponential Families, and Variational Inference
- Computer Science, Found. Trends Mach. Learn.
- 2008
The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
Annealed importance sampling
- Mathematics, Stat. Comput.
- 2001
It is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler, which can be seen as a generalization of a recently-proposed variant of sequential importance sampling.
Annealing between distributions by averaging moments
- Computer Science, Mathematics, NIPS
- 2013
A novel sequence of intermediate distributions for exponential families defined by averaging the moments of the initial and target distributions is presented and an asymptotically optimal piecewise linear schedule is derived.
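For exponential families, moment averaging amounts to interpolating the mean parameters. For 1-D Gaussians, whose moments are E[x] and E[x^2], this gives a closed form; a minimal sketch with illustrative endpoint parameters:

```python
# Moment-averaged annealing path between two 1-D Gaussians.
# The sufficient-statistic moments are E[x] and E[x^2].
mu0, var0 = 0.0, 1.0   # initial distribution (illustrative)
mu1, var1 = 3.0, 4.0   # target distribution (illustrative)

def moment_avg(beta):
    """Gaussian whose moments are convex combinations of the endpoints'."""
    m1 = (1 - beta) * mu0 + beta * mu1                          # E[x]
    m2 = (1 - beta) * (var0 + mu0**2) + beta * (var1 + mu1**2)  # E[x^2]
    return m1, m2 - m1**2  # mean, variance

mean, var = moment_avg(0.5)  # midpoint of the path
```

Note that the intermediate variance exceeds both endpoint variances when the means differ, because averaging E[x^2] absorbs the spread between the two means; this is unlike the geometric path, whose Gaussian intermediates stay between the endpoint variances.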
DEMI: Discriminative Estimator of Mutual Information
- Computer Science, ArXiv
- 2020
It is shown theoretically that DEMI and other variational approaches are equivalent when they achieve their optimum, even though DEMI itself does not optimize a variational bound.
On Variational Bounds of Mutual Information
- Computer Science, ICML
- 2019
This work introduces a continuum of lower bounds that encompasses previous bounds and flexibly trades off bias and variance and demonstrates the effectiveness of these new bounds for estimation and representation learning.
Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling
- Computer Science
- 1998
It is shown that the acceptance ratio method and thermodynamic integration are natural generalizations of importance sampling, which is most familiar to statistical audiences.
The Information Bottleneck EM Algorithm
- Computer Science, UAI
- 2003
The resulting Information Bottleneck Expectation Maximization (IB-EM) algorithm finds solutions that are superior to those of standard EM methods.
Fixing a Broken ELBO
- Computer Science, ICML
- 2018
This framework derives variational lower and upper bounds on the mutual information between the input and the latent variable, and uses these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.
Deterministic annealing for clustering, compression, classification, regression, and related optimization problems
- Computer Science, Proc. IEEE
- 1998
The deterministic annealing approach to clustering and its extensions has demonstrated substantial performance improvement over standard supervised and unsupervised learning methods in a variety of…