# Optimal regularizations for data generation with probabilistic graphical models

@article{Fanthomme2021OptimalRF,
title={Optimal regularizations for data generation with probabilistic graphical models},
author={Arnaud Fanthomme and Felipe B. Rizzato and Simona Cocco and R{\'e}mi Monasson},
journal={Journal of Statistical Mechanics: Theory and Experiment},
year={2021},
volume={2022}
}
Published 2 December 2021 in the *Journal of Statistical Mechanics: Theory and Experiment*.
Understanding the role of regularization is a central question in statistical inference. Empirically, well-chosen regularization schemes often dramatically improve the quality of the inferred models by avoiding overfitting of the training data. We consider here the particular case of L2 regularization in the maximum a posteriori (MAP) inference of generative pairwise graphical models. Based on analytical calculations on Gaussian multivariate distributions and numerical experiments on Gaussian…
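As a minimal sketch of the setting the abstract describes, the snippet below illustrates one common form of L2-regularized MAP inference for a Gaussian graphical model: the empirical covariance is shrunk by a ridge term γI before inversion, which plays the role of a Gaussian (L2) prior on the couplings. The `map_precision` helper and the choice of γ values are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def map_precision(samples, gamma):
    """Ridge-regularized precision-matrix estimate from samples of shape (n, d).

    Adding gamma * I to the empirical covariance before inverting is one
    standard L2-type regularization for Gaussian graphical models
    (illustrative helper, not the paper's exact scheme).
    """
    C = np.cov(samples, rowvar=False)
    return np.linalg.inv(C + gamma * np.eye(C.shape[0]))

# Ground-truth 3-variable Gaussian with chain-like pairwise couplings.
true_prec = np.array([[ 2.0, -0.5,  0.0],
                      [-0.5,  2.0, -0.5],
                      [ 0.0, -0.5,  2.0]])
cov = np.linalg.inv(true_prec)
X = rng.multivariate_normal(np.zeros(3), cov, size=200)

# Stronger regularization shrinks the inferred couplings toward zero.
weak = map_precision(X, gamma=0.01)
strong = map_precision(X, gamma=1.0)
print(abs(weak[0, 1]), abs(strong[0, 1]))
```

The qualitative effect is the trade-off the paper studies: too little regularization overfits the sampling noise in the empirical covariance, while too much washes out the true couplings.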
