# Minimum Description Length Revisited

@article{Grnwald2019MinimumDL, title={Minimum Description Length Revisited}, author={Peter Gr{\"u}nwald and Teemu Roos}, journal={ArXiv}, year={2019}, volume={abs/1908.08484} }

This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model…

## 38 Citations

### Far from Asymptopia

- Computer Science
- ArXiv
- 2022

A principled choice of measure is presented that avoids the bias seen in typical high-dimensional models and leads to unbiased posteriors by focusing on relevant parameters; the measure depends on the quantity of data gathered.

### Marginal likelihood computation for model selection and hypothesis testing: an extensive review

- Computer Science
- ArXiv
- 2020

This article provides a comprehensive study of the state-of-the-art of marginal likelihood computation for model selection and hypothesis testing, highlighting limitations, benefits, connections and differences among the different techniques.

### Far from Asymptopia: Unbiased high-dimensional inference cannot assume unlimited data

- Computer Science
- 2022

A principled choice of measure is presented that avoids the bias seen in typical high-dimensional models and leads to unbiased posteriors by focusing on relevant parameters; the measure depends on the quantity of data gathered.

### A Short Review on Minimum Description Length: An Application to Dimension Reduction in PCA

- Computer Science
- Entropy
- 2022

The basic ideas underlying the MDL criterion are reviewed, and the role of MDL in selecting the best principal components in the well-known PCA is investigated.

### Unsupervised Discretization by Two-dimensional MDL-based Histogram

- Computer Science
- ArXiv
- 2020

This work extends the state of the art for the one-dimensional case to obtain a model selection problem based on the normalised maximum likelihood, a form of refined MDL, and introduces a heuristic algorithm, named PALM, which partitions each dimension alternately and then merges neighbouring regions, all using the MDL principle.

### Exhaustive Symbolic Regression

- Computer Science
- ArXiv
- 2022

A new method, Exhaustive Symbolic Regression (ESR), is introduced, which systematically and efficiently considers all possible equations and is therefore guaranteed to find not only the true optimum but also a complete function ranking.

### Introduction to minimum message length inference

- Computer Science
- 2022

The Bayesian minimum message length principle of inductive inference is introduced to a general statistical audience that may not be familiar with information theoretic statistics.

### Robust subgroup discovery

- Computer Science
- Data Mining and Knowledge Discovery
- 2022

SSD++ is proposed, a greedy heuristic that finds good subgroup lists and guarantees that the most significant subgroup found according to the MDL criterion is added in each iteration.

### Robust subgroup discovery

- Computer Science
- 2021

This work proposes SSD++, a greedy heuristic that finds good subgroup lists and guarantees that the most significant subgroup found according to the MDL criterion is added in each iteration, and empirically shows that SSD++ outperforms previous subgroup discovery methods in terms of quality, generalisation on unseen data, and subgroup list size.

### Truly Unordered Probabilistic Rule Sets for Multi-class Classification

- Computer Science
- ArXiv
- 2022

Compared to non-probabilistic and (explicitly or implicitly) ordered state-of-the-art methods, this method learns rule sets that not only have better interpretability but also better predictive performance.

## References

### Advances in Minimum Description Length: Theory and Applications

- Computer Science
- 2005

Advances in Minimum Description Length is a sourcebook that will introduce the scientific community to the foundations of MDL, recent theoretical advances, and practical applications, with examples of how to apply MDL in research settings that range from bioinformatics and machine learning to psychology.

### A linear-time algorithm for computing the multinomial stochastic complexity

- Computer Science
- Inf. Process. Lett.
- 2007

### Viewing all models as “probabilistic”

- Computer Science
- COLT '99
- 1999

Several theorems are presented that suggest that with the help of the mapping of models to codes, one can successfully learn using MDL and/or Bayesian methods when (1) almost arbitrary model classes and error functions are allowed, and (2) none of the models in the class under consideration are close to the ‘truth’ that generates the data.

### Minimum description length induction, Bayesianism, and Kolmogorov complexity

- Computer Science
- IEEE Trans. Inf. Theory
- 2000

In general, it is shown that data compression is almost always the best strategy, both in model selection and prediction.

### Monte Carlo estimation of minimax regret with an application to MDL model selection

- Computer Science, Mathematics
- 2008 IEEE Information Theory Workshop
- 2008

This work presents an approach based on Monte Carlo sampling, which works for all model classes, and gives strongly consistent estimators of the minimax regret, which is particularly efficient for the important class of Markov models.

### A widely applicable Bayesian information criterion

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2013

A widely applicable Bayesian information criterion (WBIC) is defined by the average log likelihood function over the posterior distribution with the inverse temperature 1/log n, where n is the number of training samples, and it is mathematically proved that WBIC has the same asymptotic expansion as the Bayes free energy, even if a statistical model is singular for or unrealizable by a statistical model.
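
The definition summarized in this entry can be written out as follows. This is a sketch in notation assumed here, not taken from the listing: $p(x \mid w)$ is the statistical model, $\varphi(w)$ the prior, $X_1, \dots, X_n$ the training samples, and $\beta$ the inverse temperature. The log likelihood is taken with a minus sign so that the quantity is on the same scale as the Bayes free energy $F$.

```latex
\[
  \mathrm{WBIC} \;=\; \mathbb{E}_w^{\beta}\!\left[\, -\sum_{i=1}^{n} \log p(X_i \mid w) \right],
  \qquad \beta = \frac{1}{\log n},
\]
\[
  \mathbb{E}_w^{\beta}\!\left[ f(w) \right] \;=\;
  \frac{\displaystyle\int f(w)\, \prod_{i=1}^{n} p(X_i \mid w)^{\beta}\, \varphi(w)\, dw}
       {\displaystyle\int \prod_{i=1}^{n} p(X_i \mid w)^{\beta}\, \varphi(w)\, dw},
\]
\[
  F \;=\; -\log \int \prod_{i=1}^{n} p(X_i \mid w)\, \varphi(w)\, dw .
\]
```

With $\beta = 1$ the tempered expectation reduces to the ordinary posterior mean; the choice $\beta = 1/\log n$ is what the entry refers to as the inverse temperature.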

### High-dimensional penalty selection via minimum description length principle

- Computer Science
- Machine Learning
- 2018

A novel regularization selection method (MDL-RS) is proposed, in which a tight upper bound of LNML (uLNML) is minimized with a local convergence guarantee; the experimental results show that MDL-RS improves the generalization performance of regularized estimates, specifically when the model has redundant parameters.

### Information and Complexity in Statistical Modeling

- Computer Science
- ITW
- 2006

Summary form only. Inspired by Kolmogorov's structure function for finite sets as models of data in the algorithmic theory of information we adapt the construct to families of probability models to…

### Learning Bayesian Belief Networks: An Approach Based on the MDL Principle

- Computer Science
- Comput. Intell.
- 1994

A new approach for learning Bayesian belief networks from raw data is presented, based on Rissanen's minimal description length (MDL) principle, which can learn unrestricted multiply-connected belief networks and allows a trade-off between accuracy and complexity in the learned model.

### Flat Minima

- Computer Science
- Neural Computation
- 1997

A new algorithm is presented for finding low-complexity neural networks with high generalization capability; it outperforms conventional backprop, weight decay, and optimal brain surgeon/optimal brain damage, and although it requires the computation of second-order derivatives, it has backpropagation's order of complexity.