• Corpus ID: 214117204

# Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model

```bibtex
@article{Mahdaviyeh2019RiskOT,
  title   = {Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model},
  author  = {Yasaman Mahdaviyeh and Zacharie Naulet},
  journal = {arXiv: Machine Learning},
  year    = {2019}
}
```
• Published 1 December 2019
• Mathematics, Computer Science
• arXiv: Machine Learning
We study the risk of the minimum norm linear least squares estimator when the number of parameters $d$ depends on $n$ and $\frac{d}{n} \rightarrow \infty$. We assume that the data have an underlying low rank structure by restricting ourselves to spike covariance matrices, where a fixed finite number of eigenvalues grow with $n$ and are much larger than the remaining eigenvalues, which are (asymptotically) of the same order. We show that in this setting the risk of the minimum norm least squares estimator…
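The estimator studied in the abstract can be illustrated concretely. The sketch below (a minimal example, not the paper's method; the choices of $n$, $d$, and the spike strength are illustrative assumptions) draws data from a one-spike covariance model with $d \gg n$ and computes the minimum norm least squares solution via the Moore–Penrose pseudoinverse, which interpolates the training data exactly:

```python
import numpy as np

# Hedged sketch: minimum norm least squares under a spike covariance model.
# n, d, the spike strength, and the noise level are illustrative assumptions.
rng = np.random.default_rng(0)
n, d = 50, 2000                 # overparameterized regime: d >> n

# Covariance with a single spike: one eigenvalue growing with n, rest O(1).
eigvals = np.ones(d)
eigvals[0] = 10.0 * np.sqrt(n)
X = rng.standard_normal((n, d)) * np.sqrt(eigvals)  # rows ~ N(0, diag(eigvals))

beta_true = np.zeros(d)
beta_true[0] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Minimum norm interpolator: beta_hat = X^+ y (pseudoinverse solution).
beta_hat = np.linalg.pinv(X) @ y

train_resid = float(np.linalg.norm(X @ beta_hat - y))  # ~0: exact interpolation
```

Because `X` has full row rank with probability one when $d > n$, the pseudoinverse solution fits the training labels exactly while having the smallest Euclidean norm among all interpolating solutions.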
## 4 Citations

### Towards an Understanding of Benign Overfitting in Neural Networks

• Computer Science
ArXiv
• 2021
It is shown that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate, which to the authors' knowledge is the first generalization result for such networks.

### Support vector machines and linear regression coincide with very high-dimensional features

• Computer Science
NeurIPS
• 2021
A super-linear lower bound on the dimension (in terms of sample size) required for support vector proliferation in independent feature models is proved, matching the upper bounds from previous works.

### On the proliferation of support vectors in high dimensions

• Computer Science
AISTATS
• 2021
This paper identifies new deterministic equivalences for this phenomenon of support vector proliferation, and uses them to substantially broaden the conditions under which the phenomenon occurs in high-dimensional settings, and proves a nearly matching converse result.

### Classification vs regression in overparameterized regimes: Does the loss function matter?

• Computer Science
J. Mach. Learn. Res.
• 2021
This work compares classification and regression tasks in the overparameterized linear model with Gaussian features and demonstrates the very different roles and properties of loss functions used at the training phase (optimization) and the testing phase (generalization).

## References

Showing 1–10 of 22 references

### Benign overfitting in linear regression

• Computer Science
Proceedings of the National Academy of Sciences
• 2020
A characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size.

### Distance-based and continuum Fano inequalities with applications to statistical estimation

• Mathematics
ArXiv
• 2013
Two extensions of the classical Fano inequality in information theory are given, providing lower bounds on the probability that an estimator of a discrete quantity is within some distance $t$ of the quantity.

### Two models of double descent for weak features

• Computer Science
SIAM J. Math. Data Sci.
• 2020
The "double descent" risk curve was recently proposed to qualitatively describe the out-of-sample prediction accuracy of variably-parameterized machine learning models. It is shown that the risk peaks when the number of features is close to the sample size, but also that the risk decreases towards its minimum as $p$ increases beyond $n$.
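The double-descent shape described above can be reproduced in a small simulation. This sketch (an illustration under assumed parameters, not the cited paper's exact model; $n$, the feature pool size, and the noise level are my choices) fits minimum norm least squares using the first $p$ of a pool of weak features and compares the average test risk below, at, and beyond the interpolation threshold $p = n$:

```python
import numpy as np

# Illustrative sketch of double descent for minimum norm least squares with
# weak features; n, D, the noise level, and p values are assumptions.
rng = np.random.default_rng(1)
n, D = 100, 400                    # sample size; total feature pool
beta = np.ones(D) / np.sqrt(D)     # signal spread thinly over all D features

def avg_test_risk(p, trials=5, n_test=2000):
    """Average test MSE of the min-norm fit using only the first p features."""
    risks = []
    for _ in range(trials):
        Xtr = rng.standard_normal((n, D))
        ytr = Xtr @ beta + 0.5 * rng.standard_normal(n)
        bhat = np.linalg.pinv(Xtr[:, :p]) @ ytr   # minimum norm solution
        Xte = rng.standard_normal((n_test, D))
        risks.append(np.mean((Xte[:, :p] @ bhat - Xte @ beta) ** 2))
    return float(np.mean(risks))

risk_under = avg_test_risk(20)   # p << n: classical underparameterized fit
risk_peak = avg_test_risk(n)     # p = n: interpolation threshold, risk spikes
risk_over = avg_test_risk(400)   # p >> n: risk descends again
```

At $p = n$ the design matrix is square and nearly singular, so the pseudoinverse amplifies noise enormously; the measured risk there should dominate both the underparameterized and heavily overparameterized fits, tracing the peak of the double-descent curve.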

### Asymptotics and Concentration Bounds for Bilinear Forms of Spectral Projectors of Sample Covariance

• Mathematics
• 2014
Let $X, X_1, \dots, X_n$ be i.i.d. Gaussian random variables with zero mean and covariance operator $\Sigma = {\mathbb E}(X \otimes X)$ taking values in a separable Hilbert space ${\mathbb H}$. Let …


### On the limit of the largest eigenvalue of the large dimensional sample covariance matrix

• Mathematics
• 1988
In this paper the authors show that the largest eigenvalue of the sample covariance matrix tends to a limit under certain conditions when both the number of variables and the sample size tend to infinity.

### The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics

• Computer Science, Mathematics
Statistica Sinica
• 2016
The new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity.

### Geometric representation of high dimension, low sample size data

• Computer Science
• 2005
This analysis shows a tendency for the data to lie deterministically at the vertices of a regular simplex, which means all the randomness in the data appears only as a random rotation of this simplex.

### A General Framework for Consistency of Principal Component Analysis

• Computer Science
J. Mach. Learn. Res.
• 2016
This framework includes several previously studied domains of asymptotics as special cases, allows one to investigate interesting connections and transitions among the various domains, and rigorously characterizes how their relationships affect PCA consistency.