# Learning about the Parameter of the Bernoulli Model

```bibtex
@article{Vovk1997LearningAT,
  title   = {Learning about the Parameter of the Bernoulli Model},
  author  = {Vladimir Vovk},
  journal = {J. Comput. Syst. Sci.},
  year    = {1997},
  volume  = {55},
  pages   = {96-104}
}
```

We consider the problem of learning as much information as possible about the parameter θ of the Bernoulli model {P_θ | θ ∈ [0, 1]} from the statistical data x ∈ {0, 1}^n, n ≥ 1 being the sample size. Explicating this problem in terms of Kolmogorov complexity and Rissanen's minimum description length principle, we construct a computable point estimator which (a) extracts from x all information it contains about θ, and (b) discards all sample noise in x. Our result is closely connected with Rissanen's theorem…
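The two-part coding idea behind such constructions can be illustrated with a small sketch: describe a candidate θ at finite precision, then encode x under P_θ, and pick the θ minimizing the total code length. This is a generic two-part MDL illustration, not Vovk's estimator; the grid size, function name, and sample are all hypothetical.

```python
import math

def two_part_mdl_estimate(x, max_denominator=None):
    """Pick theta = i/m minimizing a simple two-part code length:
    bits to describe theta at precision 1/m, plus bits to encode x
    under the Bernoulli(theta) distribution.
    (Illustrative helper, not Vovk's construction.)"""
    n = len(x)
    k = sum(x)  # number of ones in the sample
    m = max_denominator or max(2, int(math.isqrt(n)))  # coarse grid ~ sqrt(n)
    best = None
    for i in range(1, m):  # skip theta in {0, 1}, where P_theta(x) can be 0
        theta = i / m
        data_bits = -(k * math.log2(theta) + (n - k) * math.log2(1 - theta))
        total = math.log2(m) + data_bits  # parameter bits + data bits
        if best is None or total < best[0]:
            best = (total, theta)
    return best[1]

sample = [1, 0, 1, 1, 0, 1, 1, 1]  # k = 6 ones out of n = 8
print(two_part_mdl_estimate(sample, max_denominator=100))  # → 0.75
```

With a fixed grid the parameter cost log2(m) is constant, so the minimizer coincides with the grid point nearest the maximum-likelihood estimate k/n; the interesting trade-off appears when the precision itself is chosen adaptively.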

## 15 Citations

### On the Convergence Speed of MDL Predictions for Bernoulli Sequences

- Computer Science, Mathematics; ALT
- 2004

A new upper bound on the prediction error for countable Bernoulli classes is derived, which implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes.

### Bayes code with singular prior

- Computer Science
- 2003

The code shown in this paper has a shorter code length than MDL coding (log coefficient less than k/2) when the parameters lie in a set of measure 0, and is considered to be intermediate between Shannon coding and MDL coding.

### Kolmogorov's structure functions with an application to the foundations of model selection

- Computer Science, Mathematics; The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.
- 2002

Kolmogorov (1974) proposed a non-probabilistic approach to statistics, an individual combinatorial relation between the data and its model. We vindicate, for the first time, the rightness of the…

### MDL convergence speed for Bernoulli sequences

- Computer Science, Mathematics; Stat. Comput.
- 2006

The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied and a new upper bound on the prediction error for countable Bernoulli classes is derived.

### Simplicity, information, Kolmogorov complexity and prediction

- Computer Science
- 2002

The relation between data compression and learning is treated, and it is shown that compression is almost always the best strategy, both in hypothesis identification using the minimum description length (MDL) principle and in prediction methods in the style of R. Solomonoff.

### Minimum description length induction, Bayesianism, and Kolmogorov complexity

- Computer Science; IEEE Trans. Inf. Theory
- 2000

In general, it is shown that data compression is almost always the best strategy, both in model selection and prediction.

### Algorithmic Complexity and Stochastic Properties of Finite Binary Sequences

- Computer Science, Mathematics; Comput. J.
- 1999

This paper is a survey of concepts and results related to simple Kolmogorov complexity, prefix complexity and resource-bounded complexity. We also consider a new type of complexity, statistical…

### Kolmogorov's structure functions and model selection

- Computer Science, Mathematics; IEEE Transactions on Information Theory
- 2004

The goodness-of-fit of an individual model with respect to individual data is precisely quantified, and it is shown that, within the obvious constraints, every graph is realized by the structure function of some data.

### On the concept of Bernoulliness

- Computer Science
- 2016

A natural definition of a finite Bernoulli sequence is given and compared with the Kolmogorov–Martin-Löf definition, which is interpreted as defining exchangeable sequences.

### Hypothesis Selection and Testing by the MDL Principle

- Mathematics; Comput. J.
- 1999

The central idea of the MDL (Minimum Description Length) principle is to represent a class of models (hypotheses) by a universal model capable of imitating the behavior of any model in the class. The…
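One standard way to build such a "universal model" for a class is the normalized maximum likelihood (NML) distribution, which assigns each string its maximized likelihood divided by a normalizer over all strings. Below is a minimal sketch for the Bernoulli class; the function name and example are illustrative, and practical MDL tools replace the exact normalizer with asymptotic expansions for large n.

```python
import math

def bernoulli_nml_codelength(x):
    """NML code length (in bits) of a binary string under the
    Bernoulli model class: -log2 max-likelihood + log2 normalizer.
    (Illustrative sketch; exact enumeration over counts j.)"""
    n = len(x)
    k = sum(x)

    def max_loglik(j):  # log2 of max_theta P_theta for a string with j ones
        if j in (0, n):
            return 0.0  # theta = 0 or 1 gives the string probability 1
        p = j / n
        return j * math.log2(p) + (n - j) * math.log2(1 - p)

    # normalizer over all 2^n strings, grouped by their count of ones
    comp = math.log2(sum(math.comb(n, j) * 2 ** max_loglik(j)
                         for j in range(n + 1)))
    return -max_loglik(k) + comp

print(bernoulli_nml_codelength([1, 0, 1, 1]))
```

By construction the lengths define an exact probability distribution: summing 2^(−length) over all binary strings of a given length gives 1, which is what makes the NML model a single code imitating the whole class.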

## References


### Minimum description length estimators under the optimal coding scheme

- Computer Science; EuroCOLT
- 1995

Following Rissanen, we consider the statistical model {P_θ} as a code-book, θ indexing the codes. To obtain a single code, we first encode some θ and then encode our data x with the code…

### Universal coding, information, prediction, and estimation

- Computer Science; IEEE Trans. Inf. Theory
- 1984

A connection between universal codes and the problems of prediction and statistical estimation is established. A known lower bound for the mean length of universal codes is sharpened and generalized,…

### A learning criterion for stochastic rules

- Computer Science; COLT '90
- 1990

This paper derives target-dependent upper bounds and worst-case upper bounds on the sample size required by the MDL algorithm to learn stochastic rules with given accuracy and confidence.

### A Universal Prior for Integers and Estimation by Minimum Description Length

- Mathematics
- 1983

of the number of bits required to write down the observed data, has been reformulated to extend the classical maximum likelihood principle. The principle permits estimation of the number of the…

### An Introduction to Kolmogorov Complexity and Its Applications

- Computer Science; Texts and Monographs in Computer Science
- 1993

The book presents a thorough treatment of the central ideas of Kolmogorov complexity together with a wide range of illustrative applications, and is ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive science, philosophy, artificial intelligence, statistics, and physics.

### Fisherian Inference in Likelihood and Prequential Frames of Reference

- Philosophy
- 1991

In celebration of the centenary of the birth of Sir Ronald Fisher, this paper explores Fisher's conception of statistical inference, with special attention to the importance he placed on…

### Stochastic Complexity and Modeling

- Mathematics
- 1986

A fundamental theorem is proved that gives a lower bound for the code length and, hence, for the prediction errors. The notions of "a priori information" and "information…

### Algorithmic entropy (complexity) of finite objects and its applications to defining randomness and amount of information

- Selecta Math. Soviet
- 1994

### Prequential analysis, stochastic complexity and Bayesian inference, in "Bayesian Statistics 4"

- 1992

### Stochastic Complexity in Statistical Inquiry

- Computer Science; World Scientific Series in Computer Science
- 1998