Minimum Description Length Revisited

@article{Grnwald2019MinimumDL,
  title={Minimum Description Length Revisited},
  author={Peter Gr{\"u}nwald and Teemu Roos},
  journal={ArXiv},
  year={2019},
  volume={abs/1908.08484}
}

This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model… 
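
As a concrete illustration of the two-part flavour of MDL, the sketch below selects a polynomial degree by minimizing L(model) + L(data | model). It is a textbook toy, not the refined (e.g. NML-based) codes the paper covers; the Gaussian residual code, the (k/2) log n parameter cost, and all names and constants are illustrative assumptions.

import numpy as np

def two_part_codelength(x, y, degree):
    # L(model) + L(data | model): a half-log-n cost per parameter plus
    # a Gaussian code for the residuals of a fitted polynomial.
    n = len(x)
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma2 = max(resid.var(), 1e-12)                 # avoid log(0)
    data_bits = 0.5 * n * np.log2(2 * np.pi * np.e * sigma2)
    model_bits = 0.5 * (degree + 2) * np.log2(n)     # coefficients + variance
    return data_bits + model_bits

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.1, x.size)
print(min(range(8), key=lambda d: two_part_codelength(x, y, d)))  # typically 2

The shortest total codelength lands at the quadratic: higher degrees barely shrink the residual code but keep paying the parameter cost.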

Marginal likelihood computation for model selection and hypothesis testing: an extensive review

This article provides a comprehensive study of the state-of-the-art of marginal likelihood computation for model selection and hypothesis testing, highlighting limitations, benefits, connections and differences among the different techniques.
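
For intuition about the quantity these techniques target, here is a minimal sketch comparing the closed-form marginal likelihood of a conjugate Beta-Bernoulli model against the simplest estimator such reviews cover, a naive Monte Carlo average of the likelihood over prior draws (the toy model, seed, and sample sizes are assumptions):

import numpy as np
from scipy.special import betaln

rng = np.random.default_rng(1)
data = rng.binomial(1, 0.7, size=50)       # toy Bernoulli observations
k, n = int(data.sum()), data.size

# Exact marginal likelihood of this sequence under a Beta(1, 1) prior:
# p(D) = integral of theta^k (1-theta)^(n-k) dtheta = B(k + 1, n - k + 1).
exact = np.exp(betaln(k + 1, n - k + 1))

# Naive Monte Carlo: average the likelihood over draws from the prior.
theta = rng.uniform(0.0, 1.0, size=100_000)
mc = np.mean(theta**k * (1.0 - theta)**(n - k))

print(exact, mc)                           # the two should agree closely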

Far from Asymptopia: Unbiased high-dimensional inference cannot assume unlimited data

A principled choice of measure is presented that avoids the bias typical of high-dimensional models and leads to unbiased posteriors by focusing on relevant parameters; the appropriate measure depends on the quantity of data gathered.

A Short Review on Minimum Description Length: An Application to Dimension Reduction in PCA

The basic ideas underlying the MDL criterion are reviewed and the role of MDL in the selection of the best principal components in the well known PCA is investigated.
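
One simple instantiation of an MDL-style criterion for choosing the number of components is sketched below: a BIC-like two-part code for probabilistic PCA with isotropic residual noise. This is a common textbook construction, not necessarily the criterion investigated in the paper; the parameter count and penalty form are assumptions.

import numpy as np

def mdl_pca_dim(X):
    # Two-part codelength for probabilistic PCA with k retained
    # components and pooled residual variance; lower is better.
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    lam = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]    # descending eigenvalues
    best_k, best_len = None, np.inf
    for k in range(1, d):
        sigma2 = lam[k:].mean()                      # pooled residual variance
        nll = 0.5 * n * (np.log(lam[:k]).sum() + (d - k) * np.log(sigma2))
        m = d + d * k - k * (k - 1) / 2 + k + 1      # PPCA free parameters
        codelen = nll + 0.5 * m * np.log(n)          # two-part code (nats)
        if codelen < best_len:
            best_k, best_len = k, codelen
    return best_k

# Example: 5 effective dimensions embedded in 20 are recovered.
rng = np.random.default_rng(2)
Z = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 20))
print(mdl_pca_dim(Z + 0.1 * rng.normal(size=(1000, 20))))   # typically 5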

Unsupervised Discretization by Two-dimensional MDL-based Histogram

This work extends the state of the art for the one-dimensional case to obtain a model selection problem based on the normalised maximum likelihood, a form of refined MDL, and introduces a heuristic algorithm, named PALM, which partitions each dimension alternately and then merges neighbouring regions, all using the MDL principle.

Exhaustive Symbolic Regression

A new method, Exhaustive Symbolic Regression (ESR), is introduced that systematically and efficiently considers all possible equations and is therefore guaranteed to return not only the true optimum but also a complete ranking of functions.

Introduction to minimum message length inference

The Bayesian minimum message length principle of inductive inference is introduced to a general statistical audience that may not be familiar with information theoretic statistics.

Robust subgroup discovery

This work proposes SSD++, a greedy heuristic that finds good subgroup lists and guarantees that the most significant subgroup found according to the MDL criterion is added in each iteration. Empirically, SSD++ outperforms previous subgroup discovery methods in terms of quality, generalisation on unseen data, and subgroup list size.

Truly Unordered Probabilistic Rule Sets for Multi-class Classification

Compared to non-probabilistic and (explicitly or implicitly) ordered state-of-the-art methods, this method learns rule sets with not only better interpretability but also better predictive performance.

References

Showing 1–10 of 102 references

Advances in Minimum Description Length: Theory and Applications

Advances in Minimum Description Length is a sourcebook that introduces the scientific community to the foundations of MDL, recent theoretical advances, and practical applications, with examples of how to apply MDL in research settings that range from bioinformatics and machine learning to psychology.

A linear-time algorithm for computing the multinomial stochastic complexity
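
A sketch of the linear-time computation is below. It assumes the recurrence C(K+2, n) = C(K+1, n) + (n/K)·C(K, n), with C(1, n) = 1 and C(2, n) given by a single O(n) sum, which is how I recall the paper's main identity; verify against the paper before relying on it.

import math

def multinomial_complexity(K, n):
    # C(K, n): the NML normalizer of the K-ary multinomial model.
    # Base case C(2, n) is the binomial sum; then each step is O(1).
    c2 = sum(math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
             for h in range(n + 1))                  # 0**0 == 1 in Python
    if K == 1:
        return 1.0
    prev, cur = 1.0, c2
    for k in range(1, K - 1):                        # builds C(3) .. C(K)
        prev, cur = cur, cur + (n / k) * prev
    return cur

# The stochastic complexity of data with counts n_1..n_K is then
# -sum_j n_j * log(n_j / n) + log(multinomial_complexity(K, n)).
print(multinomial_complexity(4, 100))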

Viewing all models as “probabilistic”

Several theorems are presented suggesting that, with the help of a mapping from models to codes, one can successfully learn using MDL and/or Bayesian methods when (1) almost arbitrary model classes and error functions are allowed, and (2) none of the models in the class under consideration is close to the ‘truth’ that generates the data.
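
A minimal sketch of such a model-to-code mapping: turn a loss into a probability via p(y) ∝ exp(−η·loss), so that codelength is an affine function of total loss and comparing predictors by loss is comparing them by compression. The squared-loss/Gaussian instantiation and the fixed η below are illustrative assumptions.

import numpy as np

# With squared loss and eta = 1 / (2 sigma^2), the normalizer is
# Z = sqrt(2 pi sigma^2) and we recover the familiar Gaussian code.
sigma = 0.5
eta = 1.0 / (2.0 * sigma**2)
log_z = 0.5 * np.log(2.0 * np.pi * sigma**2)

residuals = np.array([0.1, -0.3, 0.2])              # prediction errors
codelength = eta * np.sum(residuals**2) + residuals.size * log_z

# Cross-check against the direct Gaussian negative log-likelihood.
direct = np.sum(0.5 * np.log(2 * np.pi * sigma**2) + residuals**2 / (2 * sigma**2))
print(codelength, direct)                           # identical by construction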

Minimum description length induction, Bayesianism, and Kolmogorov complexity

In general, it is shown that data compression is almost always the best strategy, both in model selection and prediction.

Monte Carlo estimation of minimax regret with an application to MDL model selection

T. Roos. 2008 IEEE Information Theory Workshop, 2008.
This work presents an approach based on Monte Carlo sampling that works for all model classes and gives strongly consistent estimators of the minimax regret; it is particularly efficient for the important class of Markov models.
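
A minimal sketch of such an estimator for the Bernoulli model: with proposal q, the sample mean of p(x | θ̂(x)) / q(x) is a strongly consistent estimator of the NML normalizer, whose log is the minimax regret. The uniform proposal and sizes below are simplifying assumptions; the paper's proposals are more refined.

import math
import numpy as np

rng = np.random.default_rng(3)
n, m = 20, 200_000

def ml_prob(k):
    # Maximized Bernoulli likelihood of a length-n sequence with k ones.
    return (k / n) ** k * ((n - k) / n) ** (n - k)   # 0**0 == 1 in Python

# Exact NML normalizer: sum over all 2^n sequences, grouped by count.
exact = sum(math.comb(n, k) * ml_prob(k) for k in range(n + 1))

# Monte Carlo with the uniform proposal q(x) = 2^{-n}.
ks = rng.binomial(n, 0.5, size=m)                   # ones-counts under q
estimate = np.mean([2.0**n * ml_prob(k) for k in ks])

print(math.log(exact), math.log(estimate))          # minimax regret, nats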

A widely applicable Bayesian information criterion

A widely applicable Bayesian information criterion (WBIC) is defined as the average log likelihood function over the posterior distribution at inverse temperature 1/log n, where n is the number of training samples. It is mathematically proved that WBIC has the same asymptotic expansion as the Bayes free energy, even if the statistical model is singular for, or unrealizable by, the true distribution.
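
A grid-based sketch of the definition for a regular one-parameter model, where the exact Bayes free energy is available for comparison (WBIC's real payoff is singular models, where it is not; the toy model and grid are assumptions):

import numpy as np
from scipy.special import betaln

rng = np.random.default_rng(4)
data = rng.binomial(1, 0.3, size=100)               # toy Bernoulli data
k, n = int(data.sum()), data.size
beta = 1.0 / np.log(n)                              # WBIC inverse temperature

theta = np.linspace(1e-6, 1 - 1e-6, 100_000)        # parameter grid
nll = -(k * np.log(theta) + (n - k) * np.log1p(-theta))

# Tempered posterior: prior * likelihood^beta (uniform prior here).
w = np.exp(-beta * (nll - nll.min()))
w /= w.sum()
wbic = float(np.sum(w * nll))                       # E_beta[-log p(D | theta)]

free_energy = -betaln(k + 1, n - k + 1)             # exact -log p(D)
print(wbic, free_energy)                            # close for moderate n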

High-dimensional penalty selection via minimum description length principle

A novel regularization selection method, MDL-RS, is proposed, in which a tight upper bound of LNML (uLNML) is minimized with a local convergence guarantee. Experimental results show that MDL-RS improves the generalization performance of regularized estimates, specifically when the model has redundant parameters.

Information and Complexity in Statistical Modeling

Summary form only given. Inspired by Kolmogorov's structure function for finite sets as models of data in the algorithmic theory of information, we adapt the construct to families of probability models…

LEARNING BAYESIAN BELIEF NETWORKS: AN APPROACH BASED ON THE MDL PRINCIPLE

A new approach for learning Bayesian belief networks from raw data is presented, based on Rissanen's minimal description length (MDL) principle, which can learn unrestricted multiply-connected belief networks and allows trading off accuracy against complexity in the learned model.
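
A textbook decomposable MDL score in this spirit, for binary networks: each node contributes its plug-in log-loss given its parents plus (log2 n)/2 bits per free parameter. The paper's exact coding scheme may include further structure terms; names and the toy data are assumptions.

import numpy as np
from collections import Counter

def mdl_score(data, structure):
    # structure maps node index -> tuple of parent indices; lower is better.
    n = len(data)
    bits = 0.0
    for node, parents in structure.items():
        joint = Counter((tuple(r[p] for p in parents), r[node]) for r in data)
        marg = Counter(tuple(r[p] for p in parents) for r in data)
        for (pa, _), c in joint.items():
            bits -= c * np.log2(c / marg[pa])          # data code
        bits += 0.5 * np.log2(n) * 2 ** len(parents)   # parameter code
    return bits

# Tiny example: X1 mostly copies X0, so the dependent structure wins.
rng = np.random.default_rng(5)
x0 = rng.binomial(1, 0.5, 500)
x1 = np.where(rng.random(500) < 0.9, x0, 1 - x0)
data = np.stack([x0, x1], axis=1)
print(mdl_score(data, {0: (), 1: (0,)}))            # shorter code
print(mdl_score(data, {0: (), 1: ()}))              # longer code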

Flat Minima

A new algorithm is presented for finding low-complexity neural networks with high generalization capability; it outperforms conventional backprop, weight decay, and optimal brain surgeon/optimal brain damage. The algorithm requires the computation of second-order derivatives but has backpropagation's order of complexity.
...