Learning about the Parameter of the Bernoulli Model

  • V. Vovk
  • Published 1 August 1997
  • Computer Science, Mathematics
  • J. Comput. Syst. Sci.
We consider the problem of learning as much information as possible about the parameter θ of the Bernoulli model {P_θ : θ ∈ [0, 1]} from the statistical data x ∈ {0, 1}^n, n ≥ 1 being the sample size. Explicating this problem in terms of Kolmogorov complexity and Rissanen's minimum description length principle, we construct a computable point estimator which (a) extracts from x all information it contains about θ, and (b) discards all sample noise in x. Our result is closely connected with Rissanen's theorem…
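The abstract's two claims can be illustrated with a minimal sketch (not the paper's actual construction; function names are hypothetical, and quantizing θ to precision 1/√n is one common MDL choice): the estimator depends on x only through the count of ones k, so two samples with the same k yield the same estimate, and the positions of the ones — the sample noise — are discarded.

```python
import math

def mdl_estimate(x):
    # Sketch: k = sum(x) is a sufficient statistic for the Bernoulli
    # parameter, so the estimator depends on x only through k; the
    # positions of the ones are "sample noise" and are discarded.
    n, k = len(x), sum(x)
    m = max(2, round(math.sqrt(n)))  # quantize theta to precision ~1/sqrt(n)

    def bits(t):
        # code length (in bits) of x under parameter t
        if t in (0.0, 1.0):
            return 0.0 if k == (n if t else 0) else math.inf
        return -(k * math.log2(t) + (n - k) * math.log2(1.0 - t))

    return min((j / m for j in range(m + 1)), key=bits)
```

For example, `mdl_estimate([1, 1, 0, 0])` and `mdl_estimate([0, 1, 0, 1])` return the same value, since both samples contain two ones.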

On the Convergence Speed of MDL Predictions for Bernoulli Sequences

A new upper bound on the prediction error for countable Bernoulli classes is derived, which implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes.

Bayes code with singular prior

The code shown in this paper has a shorter code length than MDL coding (log coefficient less than k/2) when the parameters lie in a set of measure 0, and is considered to be intermediate between Shannon coding and MDL coding.

Kolmogorov's structure functions with an application to the foundations of model selection

  • N. Vereshchagin, P. Vitányi
  • Computer Science, Mathematics
    The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.
  • 2002
Kolmogorov (1974) proposed a non-probabilistic approach to statistics, an individual combinatorial relation between the data and its model. We vindicate, for the first time, the rightness of the…

MDL convergence speed for Bernoulli sequences

The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied and a new upper bound on the prediction error for countable Bernoulli classes is derived.
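The online Bernoulli prediction setting studied here can be sketched with the Krichevsky–Trofimov estimator, a standard MDL-style sequential predictor (this is an illustration of the setting, not the specific algorithm analyzed in the paper; function names are hypothetical):

```python
import math

def kt_predict(k: int, n: int) -> float:
    # Krichevsky-Trofimov estimate of P(next bit = 1) after seeing
    # k ones among n bits; its cumulative log-loss exceeds that of the
    # best Bernoulli model by roughly (1/2) log2 n bits.
    return (k + 0.5) / (n + 1)

def cumulative_log_loss(bits):
    # Total log-loss (in bits) of the sequential predictor on a 0/1 sequence.
    loss, k = 0.0, 0
    for n, b in enumerate(bits):
        p = kt_predict(k, n)
        loss += -math.log2(p if b else 1.0 - p)
        k += b
    return loss
```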

Simplicity, information, Kolmogorov complexity and prediction

The relation between data compression and learning is treated, and it is shown that compression is almost always the best strategy, both in hypothesis identification by using the minimum description length (MDL) principle and in prediction methods in the style of R. Solomonoff.

Minimum description length induction, Bayesianism, and Kolmogorov complexity

In general, it is shown that data compression is almost always the best strategy, both in model selection and prediction.

Algorithmic Complexity and Stochastic Properties of Finite Binary Sequences

  • V. V'yugin
  • Computer Science, Mathematics
    Comput. J.
  • 1999
This paper is a survey of concepts and results related to simple Kolmogorov complexity, prefix complexity and resource-bounded complexity. We consider also a new type of complexity, statistical…

Kolmogorov's structure functions and model selection

The goodness-of-fit of an individual model with respect to individual data is precisely quantified, and it is shown that, within the obvious constraints, every graph is realized by the structure function of some data.

On the concept of Bernoulliness

A natural definition of a finite Bernoulli sequence is given and compared with the Kolmogorov–Martin-Löf definition, which is interpreted as defining exchangeable sequences.

Hypothesis Selection and Testing by the MDL Principle

The central idea of the MDL (Minimum Description Length) principle is to represent a class of models (hypotheses) by a universal model capable of imitating the behavior of any model in the class…
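One standard universal model for a parametric class is the normalized maximum likelihood (Shtarkov) distribution; a minimal sketch for the Bernoulli class, assuming code lengths measured in bits (the function name is hypothetical):

```python
import math

def nml_code_length(x):
    # Normalized maximum likelihood (Shtarkov) code length in bits for
    # the Bernoulli class: -log2 P_thetahat(x) + log2 C_n, where C_n sums
    # the maximized likelihood over all binary sequences of length n.
    n, k = len(x), sum(x)

    def max_lik(j, n):
        # maximized likelihood of any length-n sequence with j ones
        if j in (0, n):
            return 1.0
        t = j / n
        return t ** j * (1.0 - t) ** (n - j)

    C = sum(math.comb(n, j) * max_lik(j, n) for j in range(n + 1))
    return -math.log2(max_lik(k, n)) + math.log2(C)
```

The extra `log2(C)` term is the price paid for universality: it is the same for every sequence of length n, so the universal model imitates every Bernoulli code up to that fixed overhead.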



Minimum description length estimators under the optimal coding scheme

  • V. Vovk
  • Computer Science
  • 1995
Following Rissanen, we consider the statistical model {P_θ} as a code-book, θ indexing the codes. To obtain a single code, we first encode some θ and then encode our data x with the code…
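The two-part scheme described here — first name a code from the code-book, then encode the data with it — can be sketched as follows, assuming θ is quantized to a grid of precision ~1/√n (one common choice; the function name is hypothetical):

```python
import math

def two_part_length(x):
    # Two-part code length in bits: log2(m + 1) bits to name one of the
    # m + 1 quantized parameter values, plus -log2 P_theta(x) bits to
    # encode the data under the named parameter; minimized over theta.
    n, k = len(x), sum(x)
    m = max(2, round(math.sqrt(n)))  # grid precision ~1/sqrt(n)
    best = math.inf
    for j in range(m + 1):
        t = j / m
        if t in (0.0, 1.0):
            data = 0.0 if k == (n if t else 0) else math.inf
        else:
            data = -(k * math.log2(t) + (n - k) * math.log2(1.0 - t))
        best = min(best, math.log2(m + 1) + data)
    return best
```

For a strongly biased sequence the total length falls well below the n bits of a literal encoding, e.g. `two_part_length([1] * 20)` is far smaller than 20.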

Universal coding, information, prediction, and estimation

A connection between universal codes and the problems of prediction and statistical estimation is established. A known lower bound for the mean length of universal codes is sharpened and generalized…

A learning criterion for stochastic rules

This paper derives target-dependent upper bounds and worst-case upper bounds on the sample size required by the MDL algorithm to learn stochastic rules with given accuracy and confidence.


…of the number of bits required to write down the observed data, has been reformulated to extend the classical maximum likelihood principle. The principle permits estimation of the number of the…

An Introduction to Kolmogorov Complexity and Its Applications

The book presents a thorough treatment of the central ideas of Kolmogorov complexity and their applications, with a wide range of illustrative examples, and will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics.

Fisherian Inference in Likelihood and Prequential Frames of Reference

In celebration of the centenary of the birth of Sir Ronald Fisher, this paper explores Fisher's conception of statistical inference, with special attention to the importance he placed on…

Stochastic Complexity and Modeling

A fundamental theorem is proved that gives a lower bound on the code length and hence on the prediction errors. The notions of "prior information" and of "information…

Algorithmic entropy (complexity) of finite objects and its applications to defining randomness and amount of information

  • Selecta Math. Soviet
  • 1994

Prequential analysis, stochastic complexity and Bayesian inference, in "Bayesian Statistics 4"

  • 1992

Stochastic Complexity in Statistical Inquiry

  • J. Rissanen
  • Computer Science
    World Scientific Series in Computer Science
  • 1998