# Minimum description length induction, Bayesianism, and Kolmogorov complexity

@article{Vitnyi2000MinimumDL,
title={Minimum description length induction, Bayesianism, and Kolmogorov complexity},
author={Paul M. B. Vit{\'a}nyi and Ming Li},
journal={ArXiv},
year={2000},
volume={cs.LG/9901014}
}
• Published 27 January 1999
• Computer Science
• ArXiv
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles minimum description length (MDL) and minimum message length (MML), abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the fundamental inequality, which in broad terms states that the principle is…
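As a reading aid, the route from Bayes's rule to ideal MDL sketched in this abstract can be written out; the following rendering follows standard expositions of the ideal MDL principle and is not quoted from the paper:

```latex
% Bayes's rule:
\Pr(H \mid D) = \frac{\Pr(D \mid H)\,\Pr(H)}{\Pr(D)} .
% Since \Pr(D) does not depend on H, maximizing the posterior is the same as
% minimizing the two ideal code lengths:
\arg\max_H \Pr(H \mid D) = \arg\min_H \bigl[ -\log \Pr(H) - \log \Pr(D \mid H) \bigr] .
% Ideal MDL replaces these code lengths by Kolmogorov complexities:
H_{\mathrm{MDL}} = \arg\min_H \bigl[ K(H) + K(D \mid H) \bigr] .
```

The fundamental inequality mentioned in the abstract delimits, in broad terms, when this substitution of complexities for code lengths is justified.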
262 Citations

### MDL induction, Bayesianism, and Kolmogorov complexity

• Computer Science
Proceedings. 1998 IEEE International Symposium on Information Theory (Cat. No.98CH36252)
• 1998
The relationship between the Bayesian approach and the minimum description length approach is established, and it is shown that data compression is almost always the best strategy, both in hypothesis identification and in prediction.

### Advances in Minimum Description Length: Theory and Applications

• Computer Science
• 2005
Advances in Minimum Description Length is a sourcebook that will introduce the scientific community to the foundations of MDL, recent theoretical advances, and practical applications, and examples of how to apply MDL in research settings that range from bioinformatics and machine learning to psychology.

### Simplicity, information, Kolmogorov complexity and prediction

• Computer Science
• 2002
The relation between data compression and learning is treated, and it is shown that compression is almost always the best strategy, both in hypothesis identification by using the minimum description length (MDL) principle and in prediction methods in the style of R. Solomonoff.

### Kolmogorov's structure functions and model selection

• Computer Science, Mathematics
IEEE Transactions on Information Theory
• 2004
The goodness of fit of an individual model with respect to individual data is precisely quantified, and it is shown that, within the obvious constraints, every graph is realized by the structure function of some data.

### Meaningful Information

• P. Vitányi
• Computer Science, Mathematics
IEEE Transactions on Information Theory
• 2006
A statistics theory based on recursive functions is developed, covering the maximum and minimum values, the existence of absolutely nonstochastic objects (which have maximal sophistication: all the information in them is meaningful and there is no residual randomness), and the relation to the halting problem and further algorithmic properties.

### Kolmogorov's structure functions with an application to the foundations of model selection

• Computer Science, Mathematics
The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.
• 2002
Kolmogorov (1974) proposed a non-probabilistic approach to statistics: an individual combinatorial relation between the data and its model. We vindicate, for the first time, the rightness of the…

### Algorithmic statistics

• Computer Science
IEEE Trans. Inf. Theory
• 2001
The algorithmic theory of statistics, sufficient statistics, and minimal sufficient statistics is developed, and it is shown that a function is a probabilistic sufficient statistic iff it is with high probability (in an appropriate sense) an algorithmic sufficient statistic.

### Minimum Description Length Revisited

• Computer Science
International Journal of Mathematics for Industry
• 2019
This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine…

### Applying MDL to Learning Best Model Granularity

• Computer Science
ArXiv
• 2000
This work tests how the theory of the Minimum Description Length behaves in practice on a general problem in model selection: learning the best model granularity, which depends critically on the granularity of the parameters.
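The granularity trade-off described above can be made concrete with a toy two-part code for Bernoulli data. This is a minimal sketch of the general idea, not the authors' experiments; the function and variable names are illustrative:

```python
import math

def two_part_code_length(k, n, bits):
    """Total two-part code length (in bits) for Bernoulli data with k ones
    in n trials, using a parameter grid of 2**bits points in (0, 1)."""
    grid = [(i + 0.5) / 2 ** bits for i in range(2 ** bits)]

    def data_bits(t):
        # L(D|H): ideal code length of the data under grid point t
        return -(k * math.log2(t) + (n - k) * math.log2(1 - t))

    # L(H) = bits needed to name one grid point; minimize L(H) + L(D|H)
    return min(bits + data_bits(t) for t in grid)

# A coarse grid is cheap to describe but fits poorly; a fine grid fits well
# but costs more model bits. MDL selects the granularity minimizing the total.
n, k = 1000, 300
lengths = {b: two_part_code_length(k, n, b) for b in range(1, 12)}
best_bits = min(lengths, key=lengths.get)
```

The minimizing precision grows only logarithmically with the sample size, which is the usual MDL prescription of roughly (1/2) log n bits per parameter.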

### On the Convergence Speed of MDL Predictions for Bernoulli Sequences

• Computer Science, Mathematics
ALT
• 2004
A new upper bound on the prediction error for countable Bernoulli classes is derived, which implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes.

## References

SHOWING 1-10 OF 107 REFERENCES

### KOLMOGOROV'S CONTRIBUTIONS TO INFORMATION THEORY AND ALGORITHMIC COMPLEXITY

• Mathematics, Computer Science
• 1989
If we let P_U(x) = Pr{U prints x} be the probability that a given computer U prints x when given a random program, it can be shown that log(1/P_U(x)) ≈ K(x) for all x, thus establishing a vital link between the "universal" probability measure P_U and the "universal" complexity K.

### The Minimum Description Length Principle in Coding and Modeling

• Computer Science
• 2000
The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms.
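The normalized maximum likelihood (NML) coding mentioned above can be computed exactly for the Bernoulli model class. The following sketch is my own illustration, not code from the paper: it evaluates the NML normalizer C_n = Σ_k C(n,k) (k/n)^k (1−k/n)^(n−k) and the resulting stochastic complexity −log₂ P(x | θ̂) + log₂ C_n:

```python
import math

def log2_comp(n):
    """log2 of the NML normalizer for the Bernoulli model class."""
    total = 0.0
    for k in range(n + 1):
        p = k / n
        # log-likelihood of the data at its own maximum-likelihood parameter
        ll = (k * math.log(p) if k else 0.0) + \
             ((n - k) * math.log(1 - p) if n - k else 0.0)
        total += math.comb(n, k) * math.exp(ll)
    return math.log2(total)

def stochastic_complexity(k, n):
    """NML code length of a sequence with k ones in n trials."""
    p = k / n
    ml_bits = -((k * math.log2(p) if k else 0.0) +
                ((n - k) * math.log2(1 - p) if n - k else 0.0))
    return ml_bits + log2_comp(n)
```

The normalizer term grows like (1/2) log₂(nπ/2), which is the asymptotically vanishing per-symbol regret alluded to in the abstract.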

### Complexity-based induction systems: Comparisons and convergence theorems

Levin has shown that if P̃'_M(x) is an unnormalized form of this measure, and P(x) is any computable probability measure on strings x, then P̃'_M(x) ≥ C·P(x), where C is a constant independent of x.

### Inductive reasoning and Kolmogorov complexity

• Computer Science
[1989] Proceedings. Structure in Complexity Theory Fourth Annual Conference
• 1989
The thesis is developed that Solomonoff's method is fundamental in the sense that many other induction principles can be viewed as particular ways to obtain computable approximations to it.

### Minimum complexity density estimation

• Computer Science
IEEE Trans. Inf. Theory
• 1991
An index of resolvability is proved to bound the rate of convergence of minimum complexity density estimators as well as the information-theoretic redundancy of the corresponding total description length to demonstrate the statistical effectiveness of the minimum description-length principle as a method of inference.

### An Introduction to Kolmogorov Complexity and Its Applications

• Computer Science
Texts and Monographs in Computer Science
• 1993
The book presents a thorough treatment of the central ideas and their applications of Kolmogorov complexity with a wide range of illustrative applications, and will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics.

### Learning about the Parameter of the Bernoulli Model

• V. Vovk
• Computer Science, Mathematics
J. Comput. Syst. Sci.
• 1997
We consider the problem of learning as much information as possible about the parameter θ of the Bernoulli model {P_θ : θ ∈ [0, 1]} from the statistical data x ∈ {0, 1}^n, n ≥ 1 being the sample size. Explicating…

### Kolmogorov Complexity, Data Compression, and Inference

If a sequence of random variables has Shannon entropy H, it is well known that there exists an efficient description of this sequence which requires only H bits. But the entropy H of a sequence also…