# Field Theoretical Analysis of On-line Learning of Probability Distributions

@article{Aida1999FieldTA, title={Field Theoretical Analysis of On-line Learning of Probability Distributions}, author={Toshiaki Aida}, journal={Physical Review Letters}, year={1999}, volume={83}, pages={3554-3557} }

On-line learning of probability distributions is analyzed from a field-theoretical point of view. An optimal on-line learning algorithm can be obtained, since the renormalization group enables us to control the number of degrees of freedom of a system according to the number of examples. We learn not the parameters of a model but the probability distributions themselves; therefore, the algorithm requires no a priori knowledge of a model.
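As an illustration only (not the paper's renormalization-group construction), the idea of letting the number of degrees of freedom of a nonparametric estimate grow with the number of examples can be sketched as follows. The histogram representation and the `n**(1/3)` bin-growth rule are assumptions made for this sketch:

```python
import random

def grown_bins(n):
    """Assumed rule: let the degrees of freedom grow like n**(1/3)."""
    return max(1, round(n ** (1.0 / 3.0)))

def online_density_stream(stream, lo=0.0, hi=1.0):
    """Yield a histogram density estimate on [lo, hi] after each example.

    No parametric model is assumed: the bin count (model resolution)
    is increased as more examples arrive, a crude stand-in for the
    paper's control of degrees of freedom by the number of examples.
    """
    seen = []
    for x in stream:
        seen.append(x)
        n = len(seen)
        n_bins = grown_bins(n)
        width = (hi - lo) / n_bins
        counts = [0] * n_bins
        for s in seen:
            i = min(int((s - lo) / width), n_bins - 1)
            counts[i] += 1
        # normalize so the histogram integrates to one
        yield [c / (n * width) for c in counts]

random.seed(0)
data = [random.betavariate(2, 5) for _ in range(200)]
estimates = list(online_density_stream(data))
final = estimates[-1]
# the estimate integrates to ~1 regardless of how many bins were grown
total = sum(p * (1.0 / len(final)) for p in final)
print(len(final), round(total, 6))
```

Re-scanning all stored examples at each step keeps the sketch short; a practical on-line version would merge or split bins incrementally instead.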

## 12 Citations

### Adaptive on-line learning of probability distributions from field theories

- Computer Science · Proceedings 1999 International Conference on Information Intelligence and Systems (Cat. No.PR00446)
- 1999

An adaptive algorithm is considered for on-line learning of probability distributions, which infers the distribution underlying observed data and requires no a priori knowledge of a model.

### Recognition and geometrical on-line learning algorithm of probability distributions

- Computer Science · Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium
- 2000

An on-line learning algorithm for probability distributions is constructed in a reparameterization-invariant form and can be made optimal, since the conformal gauge reduces the problem to a noncovariant case.

### Information theory and learning: a physical approach

- Computer Science · ArXiv
- 2000

It is proved that predictive information provides the unique measure of the complexity of the dynamics underlying a time series, and that there are classes of models characterized by *power-law growth of the predictive information* that are qualitatively more complex than any of the systems investigated before.

### Drift estimation from a simple field theory

- Mathematics
- 2008

Given the outcome of a Wiener process, what can be said about the drift and diffusion coefficients? If the process is stationary, these coefficients are related to the mean and variance of the…

### Scaling of a length scale for regression and prediction

- Physics · Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing
- 2002

A model with a length scale for smoothing the data is constructed; the uncertain region near a boundary shrinks as the speed of variation of the original signals increases, a property crucial for accurate prediction.

### Predictability, Complexity, and Learning

- Computer Science · Neural Computation
- 2001

It is argued that the divergent part of Ipred(T) provides the unique measure for the complexity of dynamics underlying a time series.

### Bayesian field theory: nonparametric approaches to density estimation

- Computer Science · Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium
- 2000

Nonparametric approaches to density estimation are discussed from a Bayesian perspective and a numerical example shows that this can be computationally feasible for low-dimensional problems.
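To make the nonparametric setting concrete, here is a minimal one-dimensional Gaussian kernel density estimator. This is a generic stand-in illustration, not the cited Bayesian field-theory method; the data values and bandwidth `h` are made up for the example:

```python
import math

def gaussian_kde(data, h):
    """Nonparametric density estimate: an average of Gaussian kernels
    of bandwidth h centered on the data points (illustration only)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)
    return density

data = [0.1, 0.2, 0.25, 0.8, 0.85]
p = gaussian_kde(data, h=0.1)

# numerically check that the estimate integrates to ~1 over a wide grid
grid = [i * 0.01 - 2.0 for i in range(501)]
mass = sum(p(x) * 0.01 for x in grid)
print(round(mass, 3))
```

Even this crude estimator shows why low-dimensional problems are computationally feasible: evaluating the density costs O(n) per query point with no fitting step at all.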

### Can Gaussian Process Regression Be Made Robust Against Model Mismatch?

- Computer Science · Deterministic and Statistical Methods in Machine Learning
- 2004

In lower-dimensional learning scenarios, the theory predicts—in excellent qualitative and good quantitative accord with simulations—that evidence maximization eliminates logarithmically slow learning and recovers the optimal scaling of the decrease of generalization error with training set size.

### How is sensory information processed?

- Computer Science · ArXiv
- 2004

This work analyzes how abstract Bayesian learners would perform on different data and discusses possible experiments that can determine which learning-theoretic computation is performed by a particular organism.

### Detecting joint tendencies of multiple time series

- Mathematics
- 2009

The moving-average smoother decomposes time-series data x(t) into a systematic part plus fluctuations, i.e., x(t) = x̄(t) + δx(t). In the language of Bayesian inference, smoothing can be understood as…
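The decomposition described above can be sketched directly; the centered window of length `w` is an assumption made for this illustration:

```python
def moving_average_decompose(x, w=5):
    """Split a series into a systematic part (centered moving average)
    plus fluctuations, so that x[t] = xbar[t] + dx[t] exactly."""
    n = len(x)
    half = w // 2
    xbar = []
    for t in range(n):
        # window is truncated at the series boundaries
        lo, hi = max(0, t - half), min(n, t + half + 1)
        window = x[lo:hi]
        xbar.append(sum(window) / len(window))
    dx = [xt - xb for xt, xb in zip(x, xbar)]
    return xbar, dx

x = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0]
xbar, dx = moving_average_decompose(x, w=3)
# reconstruction is exact by construction: x[t] = xbar[t] + dx[t]
assert all(abs(a + b - c) < 1e-12 for a, b, c in zip(xbar, dx, x))
```

The systematic part x̄(t) carries the joint tendency one would compare across series, while δx(t) carries the fluctuations.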
