# On lower bounds for the bias-variance trade-off

@article{Derumigny2020OnLB, title={On lower bounds for the bias-variance trade-off}, author={A. Derumigny and Johannes Schmidt-Hieber}, journal={arXiv: Statistics Theory}, year={2020} }

It is a common phenomenon that for high-dimensional and nonparametric statistical models, rate-optimal estimators balance squared bias and variance. Although this balancing is widely observed, little is known about whether methods exist that avoid the trade-off between bias and variance. We propose a general strategy for obtaining lower bounds on the variance of any estimator whose bias is smaller than a prespecified bound. This shows to what extent the bias-variance trade-off is unavoidable and…
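The balancing described in the abstract can be made concrete with a toy example (not from the paper): for a shrinkage estimator c·X of a Gaussian mean, the bias is (c − 1)·θ and the variance is c²·σ², so shrinking c trades variance against squared bias. The values of `theta` and `sigma` below are illustrative choices.

```python
# Illustrative sketch (not from the paper): bias-variance trade-off for the
# shrinkage estimator c*X of a Gaussian mean, where X ~ N(theta, sigma^2).
# Bias = (c - 1)*theta, Variance = c^2 * sigma^2.
theta, sigma = 2.0, 1.0  # assumed example values

def squared_bias(c):
    return ((c - 1.0) * theta) ** 2

def variance(c):
    return (c * sigma) ** 2

for c in (0.0, 0.5, 0.8, 1.0):
    mse = squared_bias(c) + variance(c)
    print(f"c={c:.1f}  bias^2={squared_bias(c):.2f}  var={variance(c):.2f}  mse={mse:.2f}")
```

For these values the MSE-optimal shrinkage is c = θ²/(θ² + σ²) = 0.8, which beats the unbiased choice c = 1: reducing variance at the cost of some bias lowers the total risk, which is exactly the trade-off whose unavoidability the paper quantifies.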

## One Citation

### From inexact optimization to learning via gradient concentration

- Computer Science · Computational Optimization and Applications
- 2022

It is shown how probabilistic results, specifically gradient concentration, can be combined with results from inexact optimization to derive sharp test error guarantees and highlights the implicit regularization properties of optimization for learning.

## References

Showing 1-10 of 66 references

### Notes on the bias-variance trade-off phenomenon

- Mathematics, Economics
- 2004

The main inequality (Theorem 1) here involves the Hellinger distance of a statistical model of an observation X, which imposes bounds on the mean of any estimator in terms of its variance. We…

### On Bayes Risk Lower Bounds

- Computer Science · J. Mach. Learn. Res.
- 2016

A general technique for lower bounding the Bayes risk of statistical estimation, applicable to arbitrary loss functions and arbitrary prior distributions, and leads to generalizations of a variety of classical minimax bounds.

### Bias-Variance trade-off characterization in a classification problem: What differences with regression?

- Computer Science
- 2005

Two major interests of this theoretical account on bias-variance decomposition are: first, that the notion of bias needs to be redefined in classification problems and, second, that given appropriate definitions of noise, bias, and variance, it is possible to unify different decompositions in a nice general theoretical framework.

### A Modern Take on the Bias-Variance Tradeoff in Neural Networks

- Computer Science · ArXiv
- 2018

It is found that both bias and variance can decrease as the number of parameters grows, and a new decomposition of the variance is introduced to disentangle the effects of optimization and data sampling.

### Decomposition of Kullback–Leibler risk and unbiasedness for parameter-free estimators

- Mathematics
- 2012

### On Bias Plus Variance

- Mathematics · Neural Computation
- 1997

This article presents several additive corrections to the conventional quadratic-loss bias-plus-variance formula. One of these corrections is appropriate when the target is not fixed (as in…

### Variance and Bias for General Loss Functions

- Computer Science · Machine Learning
- 2004

This paper suggests an explicit list of rules that any “reasonable” set of definitions should satisfy and produces bias and variance definitions which generalize to any symmetric loss function.

### A Unified Bias-Variance Decomposition for Zero-One and Squared Loss

- Computer Science · AAAI/IAAI
- 2000

This paper defines bias and variance for an arbitrary loss function, and shows that the resulting decomposition specializes to the standard one for the squared-loss case, and to a close relative of Kong and Dietterich's (1995) decomposition for the zero-one case.
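For the squared-loss case mentioned above, the standard decomposition is the exact identity MSE = bias² + variance. A hypothetical numerical check (the setup and numbers are mine, not from the paper):

```python
# Hypothetical check (not from the paper): for squared loss, the average loss
# of repeated estimates of a fixed target decomposes exactly into
# squared bias plus variance.
import random

random.seed(0)
target = 3.0
# Simulated estimates: deliberately biased (+0.5 on average) and noisy.
estimates = [target + random.gauss(0.5, 1.0) for _ in range(100_000)]

mean_est = sum(estimates) / len(estimates)
bias_sq = (mean_est - target) ** 2
var = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
mse = sum((e - target) ** 2 for e in estimates) / len(estimates)

# The identity holds exactly (up to floating-point error), not just in expectation.
assert abs(mse - (bias_sq + var)) < 1e-9
```

For zero-one loss this algebraic identity fails, which is precisely why the papers cited here redefine bias and variance so that a comparable decomposition can be recovered.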

### Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity

- Computer Science, Mathematics
- 2015

This paper establishes the convergence rates of the minimax expected length for confidence intervals in the oracle setting where the sparsity parameter is given, and focuses on the problem of adaptation to sparsity for the construction of confidence intervals.

### INADMISSIBILITY OF THE USUAL ESTIMATOR FOR THE MEAN OF A MULTIVARIATE NORMAL DISTRIBUTION

- Mathematics
- 2005

If one observes the real random variables X_1, …, X_n, independently normally distributed with unknown means ξ_1, …, ξ_n and variance 1, it is customary to estimate ξ_i by X_i. If the loss is the sum of…