Corpus ID: 215754880

Replica analysis of overfitting in generalized linear models

@article{Coolen2020ReplicaAO,
  title={Replica analysis of overfitting in generalized linear models},
  author={A. C. C. Coolen and Mansoor Sheikh and Alexander Mozeika and Fabi{\'a}n Aguirre-L{\'o}pez and Fabrizio Antenucci},
  journal={arXiv: Disordered Systems and Neural Networks},
  year={2020}
}
Nearly all statistical inference methods were developed for the regime where the number $N$ of data samples is much larger than the data dimension $p$. Inference protocols such as maximum likelihood (ML) or maximum a posteriori probability (MAP) estimation are unreliable when $p=O(N)$, due to overfitting. For many disciplines with increasingly high-dimensional data, this limitation has become a serious bottleneck. We recently showed that in Cox regression for time-to-event data the overfitting errors are not…
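
A minimal sketch of the effect described in the abstract (an illustration on assumed synthetic data, not the paper's replica analysis): fit near-unregularized logistic regression, a standard GLM, in the regime $p=O(N)$ and compare in-sample and out-of-sample log-loss. The ML fit is typically near-perfect on the training data and much worse on fresh data.

```python
# Illustrative sketch: ML overfitting in a GLM when p = O(N).
# All sizes and seeds are assumptions, not values from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
N, p = 200, 100                          # p = O(N): the regime studied here
beta = rng.normal(size=p) / np.sqrt(p)   # ground-truth parameters

def sample(n):
    X = rng.normal(size=(n, p))
    y = rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta))  # logistic link
    return X, y.astype(int)

X_train, y_train = sample(N)
X_test, y_test = sample(10 * N)

# A very large C makes the ridge penalty negligible, approximating plain ML.
ml = LogisticRegression(C=1e8, max_iter=10_000).fit(X_train, y_train)

print("train log-loss:", log_loss(y_train, ml.predict_proba(X_train)))
print("test  log-loss:", log_loss(y_test, ml.predict_proba(X_test)))
# The gap between the two losses is the overfitting effect quantified above.
```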

References

The Impact of Regularization on High-dimensional Logistic Regression

This paper studies regularized logistic regression (RLR), in which a convex regularizer that encourages the desired structure is added to the negative log-likelihood, and provides a precise analysis of the performance of RLR via the solution of a system of six nonlinear equations.
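
A brief sketch of the RLR objective as this summary describes it (my reconstruction, assuming a ridge regularizer and synthetic Gaussian data; the paper's analysis covers more general structured penalties): minimize the negative log-likelihood plus a convex penalty.

```python
# Sketch: regularized logistic regression = negative log-likelihood + convex penalty.
# The ridge penalty and all data parameters are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N, p, lam = 300, 60, 0.5
X = rng.normal(size=(N, p))
beta_true = rng.normal(size=p) / np.sqrt(p)
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

def objective(beta):
    z = X @ beta
    nll = np.sum(np.logaddexp(0.0, z) - y * z)   # negative log-likelihood
    return nll + lam * beta @ beta               # + convex (here: ridge) regularizer

beta_hat = minimize(objective, np.zeros(p), method="L-BFGS-B").x
print("per-coordinate MSE vs truth:", np.mean((beta_hat - beta_true) ** 2))
```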

Asymptotic errors for convex penalized linear regression beyond Gaussian matrices.

A rigorous derivation is provided of an explicit formula for the asymptotic mean squared error achieved by convex penalized regression estimators such as the LASSO or the elastic net, for a broad class of rotationally invariant random data matrices with arbitrary spectrum.
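
The quantity this reference characterizes can be estimated empirically; a minimal sketch (Gaussian design, the simplest rotationally invariant case; all parameter values assumed) measures the per-coordinate MSE of a LASSO fit.

```python
# Sketch: empirical per-coordinate MSE of the LASSO on a Gaussian design,
# the simplest member of the rotationally invariant ensembles covered above.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p, sigma = 400, 200, 0.5
X = rng.normal(size=(n, p)) / np.sqrt(n)
beta = np.where(rng.random(p) < 0.1, rng.normal(size=p), 0.0)  # sparse signal
y = X @ beta + sigma * rng.normal(size=n)

lasso = Lasso(alpha=0.05).fit(X, y)
print("empirical per-coordinate MSE:", np.mean((lasso.coef_ - beta) ** 2))
# The reference derives the exact n, p -> infinity limit of this quantity.
```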

Empirical bias-reducing adjustments to estimating functions

A novel, general framework for the asymptotic reduction of the bias of M-estimators from unbiased estimating functions is developed, establishing, for the first time, a strong link between reduction of estimation bias and model selection.

Information Theory, Inference, and Learning Algorithms

This textbook develops information theory and Bayesian inference side by side, covering data compression, noisy-channel coding, Monte Carlo methods, and neural networks.

Macroscopic Analysis of Vector Approximate Message Passing in a Model Mismatch Setting

This work derives state evolution equations, which macroscopically describe the dynamics of VAMP, and shows that their fixed point is consistent with the replica symmetric solution obtained by the replica method of statistical mechanics.
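
As a simplified illustration of what a state-evolution recursion looks like (a scalar AMP-style recursion with a soft-thresholding denoiser, not the full VAMP equations of the reference; all parameter values are assumptions):

```python
# Sketch: scalar state evolution with a soft-thresholding denoiser.
# tau2 macroscopically tracks the effective noise level of the iterates.
import numpy as np

rng = np.random.default_rng(3)
delta, sigma2, theta = 0.5, 0.2, 1.0     # sampling ratio n/p, noise var, threshold
x0 = np.where(rng.random(100_000) < 0.1, rng.normal(size=100_000), 0.0)

def soft(u, t):                          # soft-thresholding denoiser
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

tau2 = sigma2 + np.mean(x0 ** 2) / delta  # initialization
for _ in range(30):
    z = rng.normal(size=x0.size)
    mse = np.mean((soft(x0 + np.sqrt(tau2) * z, theta * np.sqrt(tau2)) - x0) ** 2)
    tau2 = sigma2 + mse / delta           # state-evolution update
print("fixed-point effective noise tau^2:", tau2)
```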

Analysis of Multivariate Survival Data

This book shows how to implement multivariate survival models, such as those allowing ordered and unordered multiple events per subject, using the S-Plus and SAS packages.
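
In Python, analogous fits are available through third-party survival packages; a minimal sketch (assuming the `lifelines` package and its bundled Rossi recidivism dataset) fits a Cox proportional hazards model, the model family the main paper's earlier overfitting results address.

```python
# Sketch: Cox proportional hazards fit via the (assumed) lifelines package,
# analogous to the S-Plus/SAS implementations discussed in the book.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()                  # bundled recidivism dataset
cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")
cph.print_summary()                # hazard ratios and confidence intervals
```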

Generalized Linear Models

This is the first book on generalized linear models written by authors not mostly associated with the biological sciences, and it is thoroughly enjoyable to read.

Theory of Neural Information Processing Systems

This book discusses neural networks, Shannon's information theory, and their application to unsupervised and supervised learning.

The jackknife, the bootstrap, and other resampling plans

Contents include: the jackknife estimate of bias; the jackknife estimate of variance; bias of the jackknife variance estimate; the bootstrap; the infinitesimal jackknife; the delta method and the influence function.
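
A minimal sketch of the first two constructions in this list (standard textbook formulas; the statistic and dataset are assumptions for illustration): the jackknife estimate of bias and a bootstrap estimate of variance.

```python
# Sketch: jackknife bias estimate and bootstrap variance estimate
# for a generic statistic, here the sample standard deviation.
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(size=50)
stat = np.std                       # statistic of interest (an assumption)
n = x.size

# Jackknife: recompute the statistic leaving out one observation at a time.
loo = np.array([stat(np.delete(x, i)) for i in range(n)])
jackknife_bias = (n - 1) * (loo.mean() - stat(x))

# Bootstrap: recompute the statistic on resamples drawn with replacement.
boot = np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(2000)])
bootstrap_var = boot.var()

print("jackknife bias estimate:", jackknife_bias)
print("bootstrap variance estimate:", bootstrap_var)
```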

Spin glasses: a challenge for mathematicians: cavity and mean field models

Contents include: 0. Introduction; 1. A Toy Model, the REM; 2. The Sherrington-Kirkpatrick Model; 3. The Capacity of the Perceptron: The Ising Case; 4. Capacity of the Perceptron: The Gaussian and the Spherical…