Corpus ID: 219176968

On lower bounds for the bias-variance trade-off

A. Derumigny, Johannes Schmidt-Hieber
arXiv: Statistics Theory
It is a common phenomenon that for high-dimensional and nonparametric statistical models, rate-optimal estimators balance squared bias and variance. Although this balancing is widely observed, little is known whether methods exist that could avoid the trade-off between bias and variance. We propose a general strategy to obtain lower bounds on the variance of any estimator with bias smaller than a prespecified bound. This shows to which extent the bias-variance trade-off is unavoidable and…
1 Citation

From inexact optimization to learning via gradient concentration
This paper shows how probabilistic results, specifically gradient concentration, can be combined with results from inexact optimization to derive sharp test error guarantees and highlights the implicit regularization properties of optimization for learning.


On Bayes Risk Lower Bounds
A general technique for lower bounding the Bayes risk of statistical estimation, applicable to arbitrary loss functions and arbitrary prior distributions, and leads to generalizations of a variety of classical minimax bounds.
Bias-Variance trade-off characterization in a classification problem: What differences with regression?
An important issue in machine learning theory is the so-called bias-variance trade-off, which shows that a model with a high degree of freedom has often poor generalization capabilities. In learning…
A Modern Take on the Bias-Variance Tradeoff in Neural Networks
It is found that both bias and variance can decrease as the number of parameters grows, and a new decomposition of the variance is introduced to disentangle the effects of optimization and data sampling.
Decomposition of Kullback–Leibler risk and unbiasedness for parameter-free estimators
The bias and variance of traditional parameter estimators are parameter-dependent quantities. The maximum likelihood estimate (MLE) can be defined directly on a family of distributions P and…
On Bias Plus Variance
  • D. Wolpert
  • Mathematics, Computer Science
  • Neural Computation
  • 1997
This article presents several additive corrections to the conventional quadratic loss bias-plus-variance formula. One of these corrections is appropriate when both the target is not fixed (as in…
Variance and Bias for General Loss Functions
This paper suggests an explicit list of rules that any “reasonable” set of definitions should satisfy and produces bias and variance definitions which generalize to any symmetric loss function.
General bias/variance decomposition with target independent variance of error functions derived from the exponential family of distributions
  • J. Hansen, T. Heskes
  • Computer Science
  • Proceedings 15th International Conference on Pattern Recognition. ICPR-2000
  • 2000
It is proved that this family of error functions contains all error functions decomposable in that manner and presents a useful approximation of ambiguity that is quadratic in the ensemble coefficients.
A Unified Bias-Variance Decomposition for Zero-One and Squared Loss
This paper defines bias and variance for an arbitrary loss function, and shows that the resulting decomposition specializes to the standard one for the squared-loss case, and to a close relative of Kong and Dietterich’s (1995) one for the zero-one case.
Reconciling modern machine learning practice and the bias-variance trade-off
This paper reconciles the classical understanding and the modern practice within a unified performance curve that subsumes the textbook U-shaped bias-variance trade-off curve by showing how increasing model capacity beyond the point of interpolation results in improved performance.
Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity
Confidence sets play a fundamental role in statistical inference. In this paper, we consider confidence intervals for high dimensional linear regression with random design. We first establish the…