# Rényi Divergence and Kullback-Leibler Divergence

@article{Erven2014RnyiDA, title={R{\'e}nyi Divergence and Kullback-Leibler Divergence}, author={Tim van Erven and Peter Harremo{\"e}s}, journal={IEEE Transactions on Information Theory}, year={2014}, volume={60}, pages={3797-3820} }

Rényi divergence is related to Rényi entropy much like Kullback-Leibler divergence is related to Shannon's entropy, and comes up in many settings. It was introduced by Rényi as a measure of information that satisfies almost the same axioms as Kullback-Leibler divergence, and depends on a parameter that is called its order. In particular, the Rényi divergence of order 1 equals the Kullback-Leibler divergence. We review and extend the most important properties of Rényi divergence and…
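As a minimal sketch of the statement that order 1 recovers the Kullback-Leibler divergence, the following assumes the standard finite-alphabet definition $D_\alpha(P\|Q) = \frac{1}{\alpha-1}\ln\sum_i p_i^{\alpha} q_i^{1-\alpha}$ for $\alpha > 0$, $\alpha \neq 1$. The function names `renyi_divergence` and `kl_divergence` are illustrative and do not come from the paper.

```python
import math


def renyi_divergence(p, q, alpha):
    """Rényi divergence D_alpha(p || q) in nats for finite distributions.

    Assumption: alpha > 0, alpha != 1, and q[i] > 0 wherever p[i] > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("this sketch handles orders alpha > 0 with alpha != 1")
    # Terms with p[i] == 0 contribute nothing to the sum for alpha > 0.
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats, with 0 * log 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    # Orders close to 1 should give values close to the KL divergence.
    for alpha in (0.5, 0.999, 1.001, 2.0):
        print(alpha, renyi_divergence(p, q, alpha))
    print("KL", kl_divergence(p, q))
```

Evaluating the divergence at orders just below and just above 1 gives values that bracket the Kullback-Leibler divergence, which matches the fact that Rényi divergence is nondecreasing in its order.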

## 795 Citations

Rényi Divergence to Compare Moving-Average Processes

- Computer Science
- 2018 IEEE Statistical Signal Processing Workshop (SSP)
- 2018

The purpose is to derive the expression of the Rényi divergence between the probability density functions of k consecutive samples of two real first-order moving average (MA) processes by using the eigen-decompositions of their Toeplitz correlation matrices.

Monotonically Decreasing Sequence of Divergences

- Computer Science, Mathematics
- ArXiv
- 2019

New properties of the convex divergences are shown by using integral and differential operators that are introduced and derived that include the Kullback-Leibler divergence or the reverse Kullback-Leibler divergence from these properties.

Conditional Rényi Divergence Saddlepoint and the Maximization of α-Mutual Information

- Computer Science
- Entropy
- 2019

This paper extends the major analytical results on the saddle-point and saddle-level of the conditional relative entropy to the conditional Rényi divergence and lends further evidence to the notion that a Bayesian measure of statistical distinctness introduced by R. Sibson in 1969 is the most natural generalization.

Conditional Rényi Divergences and Horse Betting

- Computer Science
- Entropy
- 2020

A universal strategy for independent and identically distributed races is presented that—without knowing the winning probabilities or the parameter of the utility function—asymptotically maximizes the gambler’s utility function.

Convexity/concavity of Rényi entropy and α-mutual information

- Computer Science
- 2015 IEEE International Symposium on Information Theory (ISIT)
- 2015

This paper shows the counterpart of this result for the Rényi entropy and the Tsallis entropy, and considers a notion of generalized mutual information, namely α-mutual information, which is defined through the Rényi divergence.

Logarithmic divergences from optimal transport and Rényi geometry

- Mathematics, Computer Science
- Information Geometry
- 2018

It is shown that if a statistical manifold is dually projectively flat with constant curvature, then it is locally induced by an L(∓α)-divergence, and a generalized Pythagorean theorem holds true.

Cramér-Rao Lower Bounds Arising from Generalized Csiszár Divergences

- Mathematics, Computer Science
- ArXiv
- 2020

Eguchi’s theory is applied to derive the Fisher information metric and the dual affine connections arising from these generalized divergence functions to arrive at a more widely applicable version of the Cramér–Rao inequality, which provides a lower bound for the variance of an estimator for an escort of the underlying parametric probability distribution.

On Rényi Information Measures and Their Applications

- Computer Science
- 2020

The contributions of this thesis are twofold: new problems related to guessing, task encoding, hypothesis testing, and horse betting are solved; and two new Rényi measures of dependence and a new conditional Rényi divergence appearing in these problems are analyzed.

Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities

- Computer Science
- Journal of Inequalities and Applications
- 2018

This paper integrates the method of interpolating inequalities that makes use of the improved Jensen-type inequalities with the well known Zipf–Mandelbrot law applied to various types of f-divergences and distances, such as Kullback–Leibler divergence, Hellinger distance, Bhattacharyya distance, $\chi^{2}$-divergence, total variation distance and triangular discrimination.

Robust Kullback-Leibler Divergence and Universal Hypothesis Testing for Continuous Distributions

- Computer Science
- IEEE Transactions on Information Theory
- 2019

A robust version of the classical KL divergence, defined as the KL divergence from a distribution to the Lévy ball of a known distribution, is shown to be continuous in the underlying distribution function with respect to the weak convergence.

## References

Rényi divergence and majorization

- Computer Science
- 2010 IEEE International Symposium on Information Theory
- 2010

It is shown how Rényi divergence appears when the theory of majorization is generalized from the finite to the continuous setting, and plays a role in analyzing the number of binary questions required to guess the values of a sequence of random variables.

Rényi divergence measures for commonly used univariate continuous distributions

- Computer Science
- Inf. Sci.
- 2013

On Rényi Divergence Measures for Continuous Alphabet Sources

- Computer Science
- 2011

The present thesis establishes a connection between Rényi divergence and the variance of the log-likelihood ratio of two distributions, which extends the work of Song [57] on the relation between Rényi entropy and the log-likelihood function, and which becomes practically useful in light of the Rényi divergence expressions derived.

On Divergences and Informations in Statistics and Information Theory

- Computer Science
- IEEE Transactions on Information Theory
- 2006

The paper deals with the f-divergences of Csiszár generalizing the discrimination information of Kullback, the total variation distance, the Hellinger divergence, and the Pearson divergence. All…

Rényi's divergence and entropy rates for finite alphabet Markov sources

- Computer Science
- IEEE Trans. Inf. Theory
- 2001

In this work, we examine the existence and the computation of the Rényi divergence rate, $\lim_{n\to\infty} \frac{1}{n} D_{\alpha}(p^{(n)} \| q^{(n)})$, between two…

Renyi's entropy and the probability of error

- Computer Science
- IEEE Trans. Inf. Theory
- 1978

It is proved that for the two-class case, the $I_{2}$ bound is sharper than many of the previously known bounds.

Alpha-Divergence for Classification, Indexing and Retrieval (Revised 2)

- Computer Science
- 2002

The alpha-divergence measure and a surrogate, the alpha-Jensen difference, are proposed and two methods of alpha-entropy estimation are investigated: indirect methods based on parametric or non-parametric density estimation over feature space; and direct methods based on combinatorial optimization of minimal spanning trees or other continuous quasi-additive graphs over feature space.

$I$-Divergence Geometry of Probability Distributions and Minimization Problems

- Mathematics
- 1975

From ε-entropy to KL-entropy: Analysis of minimum information complexity density estimation

- Computer Science, Mathematics
- 2006

A general information-theoretical inequality is developed that measures the statistical complexity of some deterministic and randomized density estimators and can lead to improvements of some classical results concerning the convergence of minimum description length and Bayesian posterior distributions.

Multiple Source Adaptation and the Rényi Divergence

- Computer Science
- UAI
- 2009

This paper presents a novel theoretical study of the general problem of multiple source adaptation using the notion of Rényi divergence, extending previous multiple source loss guarantees based on distribution weighted combinations to arbitrary target distributions P, not necessarily mixtures of the source distributions, and proving a lower bound.