Optimal regularizations for data generation with probabilistic graphical models
@article{Fanthomme2021OptimalRF,
  title   = {Optimal regularizations for data generation with probabilistic graphical models},
  author  = {Arnaud Fanthomme and Felipe B. Rizzato and Simona Cocco and R{\'e}mi Monasson},
  journal = {Journal of Statistical Mechanics: Theory and Experiment},
  year    = {2021},
  volume  = {2022}
}
Understanding the role of regularization is a central question in statistical inference. Empirically, well-chosen regularization schemes often dramatically improve the quality of the inferred models by avoiding overfitting of the training data. We consider here the particular case of L2 regularization in the maximum a posteriori (MAP) inference of generative pairwise graphical models. Based on analytical calculations on Gaussian multivariate distributions and numerical experiments on Gaussian…
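As a concrete illustration of this setting (a minimal sketch, not the paper's code), the snippet below computes an L2-penalized MAP estimate of a Gaussian precision matrix, assuming the penalized objective log det J − tr(CJ) − (λ/2)‖J‖_F²; under that convention the stationarity condition J⁻¹ − C − λJ = 0 can be solved eigenvalue by eigenvalue in the eigenbasis of the empirical covariance C. The function name and toy dimensions are illustrative.

```python
# Minimal sketch (not the paper's code) of L2-penalized MAP inference of the
# precision matrix J of a multivariate Gaussian, assuming the penalized objective
#   log det J - tr(C J) - (lam/2) * ||J||_F^2,
# whose stationarity condition J^{-1} - C - lam*J = 0 is solved eigenvalue-wise.
import numpy as np

def ridge_map_precision(C, lam):
    """Return the L2-regularized MAP estimate of the precision matrix.

    C   : empirical covariance matrix (symmetric, d x d)
    lam : L2 regularization strength (lam > 0)
    """
    c, U = np.linalg.eigh(C)                      # C = U diag(c) U^T
    # Solve lam*mu^2 + c*mu - 1 = 0 for each eigenvalue (positive root).
    mu = (np.sqrt(c**2 + 4.0 * lam) - c) / (2.0 * lam)
    return (U * mu) @ U.T                         # J = U diag(mu) U^T

# Toy usage: estimate J from a small sample and compare to the true precision.
rng = np.random.default_rng(0)
d, n = 20, 50
J_true = np.eye(d) + 0.3 * np.diag(np.ones(d - 1), 1) + 0.3 * np.diag(np.ones(d - 1), -1)
X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(J_true), size=n)
C_emp = np.cov(X, rowvar=False, bias=True)
J_hat = ridge_map_precision(C_emp, lam=0.1)
print(np.linalg.norm(J_hat - J_true) / np.linalg.norm(J_true))
```

The closed-form eigenvalue map avoids any iterative optimization in this Gaussian case, which is what makes the analytical treatment of the regularization strength tractable.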
References
(Showing 10 of 44 references.)
ACE: adaptive cluster expansion for maximum entropy graphical model inference
- Computer Science, bioRxiv
- 2016
Describes the adaptive cluster expansion (ACE) method for quickly and accurately inferring Ising or Potts models from correlation data, and shows that models inferred by ACE have substantially better statistical performance than those obtained from faster Gaussian and pseudo-likelihood methods.
Large pseudocounts and L2-norm penalties are necessary for the mean-field inference of Ising and Potts models.
- Computer Science, Physical Review E: Statistical, Nonlinear, and Soft Matter Physics
- 2014
Argues, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even as the sampling noise tends to zero, in order to correct for the systematic biases introduced by the mean-field (MF) approximation.
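A minimal sketch of naive mean-field inversion with a pseudocount, for ±1 spins; the mixing convention (shrinking one- and two-point statistics toward those of the uniform distribution) and all names are illustrative assumptions rather than the paper's exact prescription.

```python
# Naive mean-field Ising inference with a pseudocount (illustrative sketch):
# empirical one- and two-point statistics are mixed with those of a uniform
# distribution before the connected-correlation matrix is inverted.
import numpy as np

def mean_field_couplings(S, pseudocount=0.1):
    """S: (n_samples, n_spins) array of +/-1 spins; returns (J, h) under naive MF."""
    alpha = pseudocount
    m = (1 - alpha) * S.mean(axis=0)                                  # <s_i>, shrunk toward 0
    chi = (1 - alpha) * (S.T @ S) / S.shape[0] + alpha * np.eye(S.shape[1])  # <s_i s_j>
    C = chi - np.outer(m, m)                                          # connected correlations
    J = -np.linalg.inv(C)                                             # naive mean-field couplings
    np.fill_diagonal(J, 0.0)                                          # no self-couplings
    h = np.arctanh(m) - J @ m                                         # mean-field fields
    return J, h
```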
Inference of compressed Potts graphical models.
- Computer Science, Physical Review E
- 2020
Studies a double regularization scheme in which the number of Potts states (colors) available to each variable is reduced and the interaction network is made sparse; shows in particular that color compression does not affect the quality of reconstruction of the parameters corresponding to high-frequency symbols, while drastically reducing the number of other parameters and thus the computational time.
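The color-compression idea can be illustrated with a short sketch: symbols observed at a site with frequency below a cutoff are merged into a single grouped state, reducing the number of parameters to infer. The cutoff, function name, and encoding below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative "color compression" of a Potts alignment: rare symbols at each
# site are merged into one grouped state before model inference.
import numpy as np

def compress_colors(msa, f_min=0.05):
    """msa: (n_samples, n_sites) integer array of Potts symbols.
    Returns a recoded array where rare symbols at each site share one index."""
    n, L = msa.shape
    compressed = np.zeros_like(msa)
    for i in range(L):
        symbols, counts = np.unique(msa[:, i], return_counts=True)
        frequent = symbols[counts / n >= f_min]
        mapping = {s: k for k, s in enumerate(frequent)}
        other = len(frequent)                      # shared index for all rare symbols
        compressed[:, i] = [mapping.get(s, other) for s in msa[:, i]]
    return compressed
```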
Generalisation error in learning with random features and the hidden manifold model
- Computer Science, ICML
- 2020
Provides a closed-form expression for the asymptotic generalisation performance of generalised linear regression and classification on a synthetically generated dataset encompassing several problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model.
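A minimal numerical sketch of the random-features setting studied in this line of work (fixed random first layer, trained ridge readout); dimensions, noise level, and the ReLU nonlinearity are illustrative assumptions.

```python
# Learning with random features: data are projected through a fixed random layer
# with a nonlinearity, and only a ridge-regularized linear readout is trained.
import numpy as np

rng = np.random.default_rng(1)
n, d, p = 200, 50, 400                           # samples, input dim, number of random features
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)    # noisy linear teacher

F = rng.standard_normal((d, p)) / np.sqrt(d)     # fixed random projection
Z = np.maximum(X @ F, 0.0)                       # ReLU random features

lam = 1e-2
a = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)   # ridge readout
print("training MSE:", np.mean((y - Z @ a) ** 2))
```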
A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
- Computer Science, arXiv
- 2021
Provides a succinct overview of the emerging theory of overparameterized ML (abbreviated TOPML), explaining recent empirical findings on overparameterized models through a statistical signal processing perspective and emphasizing the unique aspects that define TOPML as a subfield of modern ML theory.
Deep learning: a statistical viewpoint
- Computer Science, Acta Numerica
- 2021
Surveys recent progress in statistical learning theory, providing examples that illustrate the underlying principles in simpler settings, and focuses specifically on the linear regime for neural networks, where the network can be approximated by a linear model.
Learning Sparse Neural Networks through L0 Regularization
- Computer Science, Mathematics, ICLR
- 2018
Presents a practical method for L0-norm regularization of neural networks: pruning the network during training by encouraging weights to become exactly zero, which allows model structures to be learned straightforwardly and efficiently with stochastic gradient descent and enables conditional computation in a principled way.
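The differentiable gates in this approach are built from the hard-concrete distribution; the minimal sketch below shows the gate sampling step and the expected-L0 penalty that is optimized in place of the true L0 norm. Hyperparameter values and names are illustrative assumptions.

```python
# Hard-concrete stochastic gates for L0-style regularization (illustrative sketch):
# each weight is multiplied by a gate z in [0, 1] whose sampling is differentiable,
# and the penalty is the expected number of non-zero gates.
import numpy as np

beta, gamma, zeta = 2.0 / 3.0, -0.1, 1.1          # temperature and stretching interval

def sample_gates(log_alpha, rng):
    """Sample hard-concrete gates z in [0, 1] given per-weight location params log_alpha."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=log_alpha.shape)
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1.0 - u) + log_alpha) / beta))
    s_bar = s * (zeta - gamma) + gamma            # stretch to (gamma, zeta)
    return np.clip(s_bar, 0.0, 1.0)               # hard-threshold to [0, 1]

def expected_l0(log_alpha):
    """Expected number of active gates, the differentiable surrogate of the L0 norm."""
    return np.sum(1.0 / (1.0 + np.exp(-(log_alpha - beta * np.log(-gamma / zeta)))))

# Toy usage: one gate per weight, initially undecided.
rng = np.random.default_rng(0)
log_alpha = np.zeros(1000)
z = sample_gates(log_alpha, rng)                  # multiply weights elementwise by z
print("expected active gates:", expected_l0(log_alpha))
```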
The role of regularization in classification of high-dimensional noisy Gaussian mixture
- Computer Science, Mathematics, ICML
- 2020
Provides a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge, and logistic regression, in the high-dimensional limit where the number of samples and their dimension go to infinity at a fixed ratio $\alpha = n/d$.
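A small simulation in the spirit of this setting: L2-regularized logistic classification of a two-cluster Gaussian mixture at a fixed ratio α = n/d. Dimensions, noise level, and regularization strength are illustrative assumptions.

```python
# Ridge-regularized (L2) logistic classification of a noisy two-component
# Gaussian mixture at fixed sample-to-dimension ratio alpha = n/d.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
d, alpha = 200, 2.0
n = int(alpha * d)
mu = rng.standard_normal(d) / np.sqrt(d)                  # mixture mean direction
labels = rng.choice([-1, 1], size=n)
X = labels[:, None] * mu + rng.standard_normal((n, d))    # noisy Gaussian mixture

clf = LogisticRegression(penalty="l2", C=1.0, fit_intercept=False).fit(X, labels)

# Generalization error estimated on fresh samples from the same mixture.
labels_test = rng.choice([-1, 1], size=5000)
X_test = labels_test[:, None] * mu + rng.standard_normal((5000, d))
print("test error:", np.mean(clf.predict(X_test) != labels_test))
```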
Benign overfitting in linear regression
- Computer Science, Proceedings of the National Academy of Sciences
- 2020
Characterizes linear regression problems for which the minimum-norm interpolating prediction rule has near-optimal prediction accuracy, and shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size.
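A minimal sketch of the minimum-norm interpolating rule analyzed here: with more parameters than samples, the least-squares problem has many solutions, and the pseudoinverse selects the one of smallest Euclidean norm, which fits the training data exactly. Dimensions, teacher sparsity, and noise level are illustrative assumptions.

```python
# Minimum-norm interpolation in overparameterized linear regression (p > n).
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 1000                                   # far more parameters than samples
X = rng.standard_normal((n, p)) / np.sqrt(p)
theta_star = np.zeros(p); theta_star[:10] = 1.0    # few important directions
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta_hat = np.linalg.pinv(X) @ y                  # minimum-norm interpolator
print("train MSE:", np.mean((X @ theta_hat - y) ** 2))       # ~0: exact interpolation
X_new = rng.standard_normal((2000, p)) / np.sqrt(p)
y_new = X_new @ theta_star
print("test MSE :", np.mean((X_new @ theta_hat - y_new) ** 2))
```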
High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence
- Computer Science, Mathematics
- 2008
The first result establishes consistency of the estimate in the elementwise maximum norm, which allows convergence rates in the Frobenius and spectral norms to be derived, and shows good correspondence between the theoretical predictions and the behavior observed in simulations.
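A minimal sketch of ℓ1-penalized log-determinant (graphical lasso) estimation of a sparse precision matrix, here using scikit-learn's GraphicalLasso; the regularization strength and toy model are illustrative assumptions.

```python
# l1-penalized log-determinant (graphical lasso) estimation of a sparse precision matrix.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(4)
d, n = 30, 200
Theta_true = np.eye(d) + 0.25 * np.diag(np.ones(d - 1), 1) + 0.25 * np.diag(np.ones(d - 1), -1)
X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(Theta_true), size=n)

model = GraphicalLasso(alpha=0.05).fit(X)
Theta_hat = model.precision_
print("elementwise max error:", np.max(np.abs(Theta_hat - Theta_true)))
```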