Equations of States in Statistical Learning for a Nonparametrizable and Regular Case

@article{Watanabe2009EquationsOS,
  title={Equations of States in Statistical Learning for a Nonparametrizable and Regular Case},
  author={Sumio Watanabe},
  journal={ArXiv},
  year={2009},
  volume={abs/0906.0211}
}
Many learning machines that have hierarchical structure or hidden variables are now being used in information science, artificial intelligence, and bioinformatics. However, several learning machines used in such fields are not regular but singular statistical models, hence their generalization performance is still left unknown. To overcome these problems, in the previous papers, we proved new equations in statistical learning, by which we can estimate the Bayes generalization loss from the… 
Asymptotic Learning Curve and Renormalizable Condition in Statistical Learning Theory
TLDR
This paper defines a renormalizable condition of the statistical estimation problem, and shows that, under such a condition, the asymptotic learning curves are ensured to be subject to the universal law, even if the true distribution is unrealizable and singular for a statistical model.
Conditional vs marginal estimation of the predictive loss of hierarchical models using WAIC and cross-validation
TLDR
It is shown that conditional-level WAIC does not provide a reliable estimator of its target loss, and simulations show that it can favour the incorrect model, so it is recommended that WAIC and ISCVL be evaluated using the marginalized likelihood where practicable.
Approximating cross-validatory predictive evaluation in Bayesian latent variable models with integrated IS and WAIC
TLDR
iIS and iWAIC aim at improving the approximations given by importance sampling and WAIC in Bayesian models with possibly correlated latent variables by integrating the predictive density over the distribution of the latent variables associated with the held-out without reference to its observation.

References

A formula of equations of states in singular learning machines
  • Sumio Watanabe
  • Mathematics
    2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
  • 2008
TLDR
A formula of equations of states is established which holds among Bayes and Gibbs generalization and training errors, and it is shown that two generalization errors can be estimated from two training errors.
Learning Coefficients of Layered Models When the True Distribution Mismatches the Singularities
TLDR
The Bayes generalization error is studied under the condition that the Kullback distance of the true distribution from the distribution represented by singularities is in proportion to 1/n and two results are shown.
Algebraic Analysis for Nonidentifiable Learning Machines
TLDR
It is rigorously proved that the Bayesian stochastic complexity or the free energy is asymptotically equal to $\lambda_1 \log n - (m_1 - 1) \log\log n + \text{constant}$, where $n$ is the number of training samples and $\lambda_1$ and $m_1$ are the rational number and the natural number, which are determined as the birational invariant values of the singularities in the parameter space.
Algebraic Analysis for Singular Statistical Estimation
This paper clarifies learning efficiency of a non-regular parametric model such as a neural network whose true parameter set is an analytic variety with singular points. By using Sato's b-function we
Singularities Affect Dynamics of Learning in Neuromanifolds
TLDR
An overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and Gaussian mixtures is given and the natural gradient method is shown to perform well because it takes the singular geometrical structure into account.
Learning efficiency of redundant neural networks in Bayesian estimation
  • Sumio Watanabe
  • Computer Science, Mathematics
    IEEE Trans. Neural Networks
  • 2001
This paper proves that the Bayesian stochastic complexity of a layered neural network is asymptotically smaller than that of a regular statistical model if it contains the true distribution. We