• Corpus ID: 232035586

# On the Minimal Error of Empirical Risk Minimization

```bibtex
@article{Kur2021OnTM,
  title={On the Minimal Error of Empirical Risk Minimization},
  author={Gil Kur and Alexander Rakhlin},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.12066}
}
```
• Published 24 February 2021
• Computer Science
• ArXiv
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around…
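For context, a standard formulation of the procedure the abstract refers to (this is the textbook definition of ERM under squared loss, not a formula taken from the paper): given samples $(x_1, y_1), \ldots, (x_n, y_n)$ and a function class $\mathcal{F}$, the empirical risk minimizer is

```latex
\hat{f}_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( f(x_i) - y_i \bigr)^2 .
```

In the fixed design setting the covariates $x_1, \ldots, x_n$ are deterministic and error is measured at those points; in random design they are drawn i.i.d. from a distribution and error is measured in population norm.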
1 Citation

### Efficient Minimax Optimal Estimators For Multivariate Convex Regression

• Computer Science, Mathematics
COLT
• 2022
This work is the first to show the existence of efficient minimax optimal estimators for non-Donsker classes whose corresponding Least Squares Estimators are provably minimax sub-optimal; a result of independent interest.

## References

Showing 1–10 of 39 references

### A new perspective on least squares under convex constraint

This paper presents three general results about the problem of estimating the mean of a Gaussian random vector: an exact computation of the main term in the estimation error, obtained by relating it to expected maxima of Gaussian processes; a theorem showing that the least squares estimator is always admissible up to a universal constant in any problem of this kind; and a counterexample showing that the least squares estimator may not always be minimax rate-optimal.

### Empirical Processes in M-estimation, volume 6

• Cambridge University Press
• 2000

### Rates of convergence for minimum contrast estimators

• Mathematics
• 1993
We shall present here a general study of minimum contrast estimators in a nonparametric setting (although our results are also valid in the classical parametric case) for independent…

### Adaptation in multivariate log-concave density estimation

• Mathematics
The Annals of Statistics
• 2021
We study the adaptation properties of the multivariate log-concave maximum likelihood estimator over two subclasses of log-concave densities. The first consists of densities with polyhedral support…

### Isotonic regression in general dimensions

• Mathematics
The Annals of Statistics
• 2019
We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^d$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed,…

### Mathematical Foundations of Infinite-Dimensional Statistical Models

• Mathematics, Computer Science
• 2015
This chapter discusses nonparametric statistical models, function spaces and approximation theory, and the minimax paradigm, which aims to provide a model for adaptive inference of likelihood-based procedures.

### Concentration Inequalities - A Nonasymptotic Theory of Independence

• Mathematics
Concentration Inequalities
• 2013
Deep connections with isoperimetric problems are revealed whilst special attention is paid to applications to the supremum of empirical processes.

### On Suboptimality of Least Squares with Application to Estimation of Convex Bodies

• Mathematics, Computer Science
COLT
• 2020
It is established that Least Squares is minimax sub-optimal and achieves a rate of $\tilde{\Theta}_d(n^{-2/(d-1)})$, whereas the minimax rate is $\Theta_d(n^{-4/(d+3)})$.
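As a quick sanity check on the gap between the two rates in this snippet (simple arithmetic, not a claim from the paper itself): the LSE exponent $2/(d-1)$ falls below the minimax exponent $4/(d+3)$ exactly when $d > 5$, so the LSE rate decays strictly slower in those dimensions:

```latex
\frac{2}{d-1} < \frac{4}{d+3}
\;\Longleftrightarrow\; 2(d+3) < 4(d-1)
\;\Longleftrightarrow\; 10 < 2d
\;\Longleftrightarrow\; d > 5 .
```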

### Convex Regression in Multidimensions: Suboptimality of Least Squares Estimators

• Mathematics
• 2020
The least squares estimator (LSE) is shown to be suboptimal in squared error loss in the usual nonparametric regression model with Gaussian errors for $d \geq 5$ for each of the following families of…

### Benign overfitting in ridge regression

• Computer Science
• 2020
This work provides non-asymptotic generalization bounds for overparametrized ridge regression that depend on the arbitrary covariance structure of the data, and shows that those bounds are tight for a range of regularization parameter values.