• Corpus ID: 15916415

Optimally approximating exponential families

@article{Rauh2011,
  title={Optimally approximating exponential families},
  author={Johannes Rauh},
  journal={Kybernetika},
  year={2011}
}
  • J. Rauh
  • Published 28 October 2011
  • Mathematics, Computer Science
  • Kybernetika
This article studies exponential families $\mathcal{E}$ on finite sets such that the information divergence $D(P\|\mathcal{E})$ of an arbitrary probability distribution from $\mathcal{E}$ is bounded by some constant $D>0$. A particular class of low-dimensional exponential families that have low values of $D$ can be obtained from partitions of the state space. The main results concern optimality properties of these partition exponential families. Exponential families where $D=\log(2)$ are… 
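For the partition exponential families mentioned in the abstract, the rI-projection of a distribution P has a simple closed form: it spreads each block's total mass uniformly over that block, so D(P‖E) can be computed directly. A minimal sketch under that assumption (function and variable names are illustrative, not from the paper):

```python
import math

def divergence_from_partition_family(p, partition):
    """KL divergence D(P || E) of a distribution p (dict: state -> prob)
    from the partition exponential family defined by `partition`
    (a list of blocks, each a list of states).

    Assumes the rI-projection of P assigns each state x the value
    P(block(x)) / |block(x)|, i.e. the block mass spread uniformly."""
    d = 0.0
    for block in partition:
        mass = sum(p[x] for x in block)
        for x in block:
            if p[x] > 0:
                # contribution p(x) * log( p(x) / (mass / |block|) )
                d += p[x] * math.log(p[x] * len(block) / mass)
    return d

# A point mass inside a block of size 2 attains D = log 2, consistent
# with the role of log(2) for partitions into blocks of size at most 2.
p = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
partition = [[0, 1], [2, 3]]
print(divergence_from_partition_family(p, partition))  # prints 0.6931471805599453 (= log 2)
```

A uniform distribution lies in every partition family and gives divergence 0, so the sketch also serves as a quick sanity check of the projection formula.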
Scaling of model approximation errors and expected entropy distances
We compute the expected value of the Kullback-Leibler divergence to various fundamental statistical models with respect to canonical priors on the probability simplex. We obtain closed formulas for
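The closed formulas referred to above are derived in the cited paper; the simplest instance, the expected divergence from the uniform distribution under the flat Dirichlet prior on the simplex, reduces to the known identity E[H(P)] = H_n − 1 (harmonic number minus one). A short Monte Carlo sketch checking that identity (names are illustrative):

```python
import math
import random

def sample_simplex(n, rng):
    """Draw a uniform (flat Dirichlet) sample from the probability simplex
    by normalizing i.i.d. exponential variates."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [x / s for x in e]

def kl_to_uniform(p):
    """D(P || u) for the uniform distribution u on len(p) states."""
    n = len(p)
    return sum(x * math.log(x * n) for x in p if x > 0)

rng = random.Random(0)
n = 4
samples = 20000
est = sum(kl_to_uniform(sample_simplex(n, rng)) for _ in range(samples)) / samples

# Closed form: E[D(P||u)] = log n - E[H(P)] = log n - (H_n - 1)
harmonic = sum(1.0 / k for k in range(1, n + 1))
exact = math.log(n) - (harmonic - 1)
print(est, exact)  # the Monte Carlo estimate should be close to the exact value
```

For n = 4 the exact value is log 4 − (25/12 − 1) ≈ 0.303, and the estimate converges to it as the sample count grows.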
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
This analysis covers discrete restricted Boltzmann machines and naive Bayes models as special cases, and shows that a q-ary deep belief network with layers of a suitable width can approximate any probability distribution on its state space without exceeding a given Kullback-Leibler divergence.
Maximal Information Divergence from Statistical Models Defined by Neural Networks
We review recent results about the maximal values of the Kullback-Leibler information divergence from statistical models defined by neural networks, including naive Bayes models and restricted Boltzmann machines…
Restricted Boltzmann Machines: Introduction and Review
An introduction to the mathematical analysis of restricted Boltzmann machines is given, recent results on the geometry of the sets of probability distributions representable by these models are reviewed, and a few directions for further investigation are suggested.


Finding the Maximizers of the Information Divergence From an Exponential Family
  • J. Rauh
  • Computer Science
    IEEE Transactions on Information Theory
  • 2011
It is shown that the rI-projection of a maximizer P to $\mathcal{E}$ is a convex combination of P and a probability measure P⁻ with disjoint support and the same value of the sufficient statistics $A$.
On maximization of the information divergence from an exponential family
The information divergence of a probability measure P from an exponential family E over a finite set is defined as the infimum of the divergences of P from Q subject to Q in E. For convex exponential…
Maximization of the information divergence from an exponential family and criticality
  • F. Matús, J. Rauh
  • Mathematics
    2011 IEEE International Symposium on Information Theory Proceedings
  • 2011
The problem of maximizing the information divergence from an exponential family is compared to the maximization of an entropy-like quantity over the boundary of a polytope. First-order conditions on…
Information Theory and Statistics: A Tutorial
This tutorial is concerned with applications of information theory concepts in statistics, in the finite alphabet setting, and an introduction is provided to the theory of universal coding, and to statistical inference via the minimum description length principle motivated by that theory.
Inducing Features of Random Fields
The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated.
On the toric algebra of graphical models
We formulate necessary and sufficient conditions for an arbitrary discrete probability distribution to factor according to an undirected graphical model, or a log-linear model, or other more general…
  • N. Ay
  • Computer Science
  • 2002
Theoretical results are established about the low complexity of optimal solutions to the optimization of frequently used measures, such as the mutual information, in an unconstrained and more theoretical setting.
Matroid theory
The current status is given for all unsolved problems and conjectures that appear in Chapter 14, and the corrected text is given with the inserted words underlined.
Minimax Entropy Principle and Its Application to Texture Modeling
The minimax entropy principle is applied to texture modeling, where a novel Markov random field model, called FRAME, is derived, and encouraging results are obtained in experiments on a variety of texture images.