Equations of States in Singular Statistical Estimation

@article{Watanabe2010EquationsOS,
  title   = {Equations of States in Singular Statistical Estimation},
  author  = {Sumio Watanabe},
  journal = {Neural Networks},
  year    = {2010},
  volume  = {23},
  number  = {1},
  pages   = {20--34}
}
  • Sumio Watanabe
  • Published 4 December 2007
  • Mathematics
  • Neural networks : the official journal of the International Neural Network Society

Citations

A formula of equations of states in singular learning machines
  • Sumio Watanabe
  • Mathematics
    2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
  • 2008
TLDR
A formula of equations of states is established which holds among Bayes and Gibbs generalization and training errors, and it is shown that two generalization errors can be estimated from two training errors.
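As a rough sketch of the result (the notation below is paraphrased for illustration, not copied from the paper): writing Bg, Bt for the Bayes generalization and training errors, Gg, Gt for the Gibbs generalization and training errors, and β for the inverse temperature of the posterior, the equations of states assert that, asymptotically,

    E[Bg] = E[Bt] + 2β ( E[Gt] - E[Bt] ),
    E[Gg] = E[Gt] + 2β ( E[Gt] - E[Bt] ),

so both expected generalization errors can be read off from the two observable training errors without knowledge of the true distribution.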
Equations of States in Statistical Learning for a Nonparametrizable and Regular Case
TLDR
It is proved that the same equations hold even if a true distribution is not contained in a parametric model, and the proposed equations in a regular case are asymptotically equivalent to the Takeuchi information criterion.
Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory
TLDR
The Bayes cross-validation loss is asymptotically equivalent to the widely applicable information criterion as a random variable, and model selection and hyperparameter optimization using these two values are asymptotically equivalent.
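For concreteness, here is a minimal numerical sketch of the widely applicable information criterion computed from posterior samples; the function name, array layout, and use of NumPy/SciPy are assumptions for illustration, not taken from the paper.

    import numpy as np
    from scipy.special import logsumexp

    def waic(log_lik):
        # log_lik: array of shape (S, n) with entries log p(x_i | w_s)
        # for S posterior draws w_1..w_S and n data points x_1..x_n.
        S, n = log_lik.shape
        # Bayes training loss: -(1/n) * sum_i log E_w[ p(x_i | w) ]
        bayes_train = -np.mean(logsumexp(log_lik, axis=0) - np.log(S))
        # Functional variance V_n: posterior variance of log p(x_i | w), summed over i.
        functional_var = np.sum(np.var(log_lik, axis=0))
        # WAIC = Bayes training loss + V_n / n.
        return bayes_train + functional_var / n

Model selection then amounts to preferring the candidate with the smaller waic value, which the paper shows is asymptotically equivalent to choosing by Bayes cross-validation.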
Statistical Learning Theory of Quasi-Regular Cases
TLDR
It is proved that, in a quasi-regular case, two birational invariants are equal to each other, so that the symmetry of the generalization and training errors holds, and that the quasi-regular case is useful for studying statistical learning theory.
An Introduction to Algebraic Geometry and Statistical Learning Theory
TLDR
In this book, an algebraic geometrical method is established for statistical models to which the conventional theory of regular statistical models does not apply, and it is theoretically shown that, in singular models, Bayes estimation is more appropriate than one-point estimation, even asymptotically.
Accuracy of latent-variable estimation in Bayesian semi-supervised learning
Learning Coefficient of Generalization Error in Bayesian Estimation and Vandermonde Matrix-Type Singularity
TLDR
This letter gives tight new bounds on the learning coefficients for Vandermonde matrix-type singularities, together with their explicit values under certain conditions, which yield the learning coefficients of three-layered neural networks and normal mixture models.
Asymptotic accuracy of distribution-based estimation of latent variables
TLDR
The present paper formulates distribution-based functions for the errors in the estimation of the latent variables of hierarchical statistical models and analyzes the asymptotic behavior for both the maximum likelihood and the Bayes methods.
Asymptotic Learning Curve and Renormalizable Condition in Statistical Learning Theory
TLDR
This paper defines a renormalizable condition for the statistical estimation problem and shows that, under this condition, the asymptotic learning curves obey the universal law even if the true distribution is unrealizable and singular for the statistical model.
A Limit Theorem in Singular Regression Problem
TLDR
A limit theorem is proved which shows the relation between the singular regression problem and two birational invariants, a real log canonical threshold and a singular fluctuation, and enables us to estimate the generalization error from the training error without any knowledge of the true probability distribution.

References

Showing 1-10 of 34 references
Learning Coefficients of Layered Models When the True Distribution Mismatches the Singularities
TLDR
The Bayes generalization error is studied under the condition that the Kullback distance of the true distribution from the distribution represented by the singularities is proportional to 1/n, and two results are shown.
Singularities in mixture models and upper bounds of stochastic complexity
Learning efficiency of redundant neural networks in Bayesian estimation
  • Sumio Watanabe
  • Computer Science, Mathematics
    IEEE Trans. Neural Networks
  • 2001
This paper proves that the Bayesian stochastic complexity of a layered neural network is asymptotically smaller than that of a regular statistical model if it contains the true distribution.
Algebraic geometrical methods for hierarchical learning machines
Exchange Monte Carlo Sampling From Bayesian Posterior for Singular Learning Machines
TLDR
It is proposed that the exchange Monte Carlo method is more effective for Bayesian learning in singular learning machines than in regular learning machines, and its effectiveness is shown by comparing the numerical stochastic complexity with the theoretical one.
Algebraic Analysis for Singular Statistical Estimation
This paper clarifies the learning efficiency of a non-regular parametric model, such as a neural network whose true parameter set is an analytic variety with singular points, using Sato's b-function.
Algebraic geometry of singular learning machines and symmetry of generalization and training errors
Algebraic Analysis for Nonidentifiable Learning Machines
TLDR
It is rigorously proved that the Bayesian stochastic complexity, or free energy, is asymptotically equal to λ1 log n - (m1 - 1) log log n + constant, where n is the number of training samples and λ1 and m1 are, respectively, the rational number and the natural number determined as the birational invariants of the singularities in the parameter space.
Algebraic Information Geometry for Learning Machines with Singularities
TLDR
The rigorous asymptotic form of the stochastic complexity is clarified based on resolution of singularities, and two different problems are studied.
On the Problem in Model Selection of Neural Network Regression in Overrealizable Scenario
TLDR
The article analyzes the expected training error and the expected generalization error of neural networks and radial basis functions in overrealizable cases and clarifies the difference from regular models, for which identifiability holds.