Risk bounds for model selection via penalization
@article{Barron1999RiskBF, title={Risk bounds for model selection via penalization}, author={Andrew R. Barron and Lucien Birgé and Pascal Massart}, journal={Probability Theory and Related Fields}, year={1999}, volume={113}, pages={301--413} }
Abstract. Performance bounds for criteria for model selection are developed using recent theory for sieves. The model selection criteria are based on an empirical loss or contrast function with an added penalty term motivated by empirical process theory and roughly proportional to the number of parameters needed to describe the model divided by the number of observations. Most of our examples involve density or regression estimation settings and we focus on the problem of estimating the unknown…
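The selection rule sketched in the abstract (minimize an empirical contrast plus a penalty roughly proportional to the number of model parameters divided by the number of observations) can be illustrated in a few lines. The Python toy below is a minimal sketch under illustrative assumptions (polynomial regression models, a penalty constant of 2.0, synthetic data); it is not the paper's actual construction.

```python
import numpy as np

# Minimal sketch of penalized model selection: for each candidate model,
# compute the empirical contrast (here, mean squared error) of the fitted
# estimator, add a penalty proportional to dim(model) / n, and keep the
# minimizer. Model family and penalty constant are illustrative assumptions.

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)  # unknown function plus noise

def empirical_contrast(degree):
    """Least-squares fit on the model of polynomials of the given degree."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    return np.mean(resid ** 2)

best_crit, best_degree = np.inf, None
for degree in range(15):
    dim = degree + 1            # parameters needed to describe the model
    penalty = 2.0 * dim / n     # roughly proportional to dim / n
    crit = empirical_contrast(degree) + penalty
    if crit < best_crit:
        best_crit, best_degree = crit, degree

print(f"selected degree: {best_degree} (penalized criterion {best_crit:.4f})")
```

With this penalty shape, a richer model is selected only if it reduces the empirical contrast by more than the penalty increment, which is the trade-off between approximation and penalty relative to the sample size that the abstract describes.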
783 Citations
Risk of penalized least squares, greedy selection and ℓ1-penalization for flexible function libraries
- Computer Science
- 2008
This paper analyzes the performance of penalized least squares estimators via a theory of acceptable penalties, under which the estimator optimizing the empirical criterion has risk characterized by a corresponding population trade-off of approximation and penalty relative to the sample size.
Model Selection and Error Estimation
- Computer Science, Mathematics
- Machine Learning
- 2004
A tight relationship between error estimation and data-based complexity penalization is pointed out: any good error estimate may be converted into a data-based penalty function, and the performance of the estimate is governed by the quality of the error estimate.
MODEL SELECTION FOR NONPARAMETRIC REGRESSION
- Computer Science, Mathematics
- 1997
A model complexity penalty term in AIC is incorporated to handle selection bias, and the resulting estimators are shown to achieve a trade-off among approximation error, estimation error and model complexity without prior knowledge about the true regression function.
Model Selection for Nonparametric Regression
- Computer Science, Mathematics
- 1998
A model complexity penalty term in AIC is incorporated to handle the selection bias in regression estimation, and the resulting estimators are shown to achieve a trade-off among approximation error, estimation error and model complexity automatically, without prior knowledge about the true regression function.
Minimal Penalties for Gaussian Model Selection
- Mathematics, Computer Science
- 2007
This work is mainly devoted to a precise analysis of which penalties should be used in order to perform model selection via the minimization of a penalized least-squares type criterion, within a general Gaussian framework that includes the classical ones.
Gaussian model selection
- Computer Science
- 2001
Abstract. Our purpose in this paper is to provide a general approach to model selection via penalization for Gaussian regression and to develop our point of view about this subject. The advantage and…
Adaptive Model Selection Using Empirical Complexities
- Computer Science
- 1998
The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand, even when each model class has an infinite VC or pseudo dimension.
Model selection for regression on a random design
- Mathematics
- 2002
We consider the problem of estimating an unknown regression function when the design is random with values in . Our estimation procedure is based on model selection and does not rely on any prior…
Minimax nonparametric classification - Part II: Model selection for adaptation
- Mathematics, Computer Science
- IEEE Trans. Inf. Theory
- 1999
It is shown that with a suitable model selection criterion, the penalized maximum-likelihood estimator has a risk bounded by an index of resolvability expressing a good tradeoff among approximation error, estimation error, and model complexity.
Model selection in density estimation via cross-validation
- Mathematics
- 2008
The problem of model selection by cross-validation is addressed in the density estimation framework. Extensively used in practice, cross-validation (CV) remains poorly understood, especially in the…
References
Showing 1-10 of 79 references
Model selection for regression on a fixed design
- Mathematics, Computer Science
- 2000
This work considers a collection of finite-dimensional linear spaces and the least-squares estimator built on a data-driven model selected from this collection, and it deduces adaptivity properties of the resulting estimator that hold under mild moment conditions on the errors.
Minimum complexity regression estimation with weakly dependent observations
- Mathematics
- 1994
The minimum complexity regression estimation framework, due to Barron, is a general data-driven methodology for estimating a regression function from a given list of parametric models using…
On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method
- Mathematics
- 1982
Abstract: A class of probability density estimates can be obtained by penalizing the likelihood by a functional which depends on the roughness of the logarithm of the density. The limiting case of…
An asymptotic property of model selection criteria
- Computer Science, Mathematics
- Proceedings of 1994 Workshop on Information Theory and Statistics
- 1994
The asymptotic risk of the density estimator is determined under conditions on the penalty term and is shown to be minimax optimal; the optimal rate of convergence is achieved for densities in certain smooth nonparametric families without knowing the smoothness parameters in advance.
Adaptive Spline Estimates for Nonparametric Regression Models
- Mathematics
- 1993
where the noise terms are independent standard Gaussian random variables, while the regressors x_i are deterministic and equally spaced, i.e., x_i = (2i-1)/(2n). We suppose that the unknown function f(·) is…
Rates of convergence for minimum contrast estimators
- Mathematics
- 1993
Summary. We shall present here a general study of minimum contrast estimators in a nonparametric setting (although our results are also valid in the classical parametric case) for independent…
Minimum contrast estimators on sieves: exponential bounds and rates of convergence
- Mathematics
- 1998
This paper, which we dedicate to Lucien Le Cam for his seventieth birthday, has been written in the spirit of his pioneering works on the relationships between the metric structure of the parameter…
Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications
- Mathematics, Computer Science
- Inf. Comput.
- 1992
Minimax risk over ℓp-balls for ℓq-error
- Mathematics
- 1994
Summary. Consider estimating the mean vector θ from data N_n(θ, σ²I) with ℓ_q norm loss, q ≥ 1, when θ is known to lie in an n-dimensional ℓ_p ball, p ∈ (0, ∞). For large n, the ratio of minimax linear risk to minimax…
Wavelet Shrinkage: Asymptopia?
- Computer Science
- 1995
A method for curve estimation based on n noisy data is proposed: translate the empirical wavelet coefficients towards the origin by an amount √(2 log n)/√n. Loose parallels are drawn with near-optimality in robustness and with the broad near-eigenfunction properties of wavelets themselves.
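The shrinkage rule quoted in this last abstract is concrete enough to sketch: compute empirical wavelet coefficients, translate each detail coefficient towards the origin (soft thresholding), and invert the transform. The Python sketch below uses a hand-rolled orthonormal Haar transform and a synthetic signal as illustrative assumptions; on the raw coefficient scale used here, the same rule reads σ√(2 log n), which matches √(2 log n)/√n after a 1/√n normalization of the data.

```python
import numpy as np

# Minimal sketch of wavelet shrinkage: soft-threshold the empirical wavelet
# coefficients, i.e. translate them towards the origin by a fixed amount.
# The Haar transform, signal, and noise level are illustrative assumptions.

def haar_forward(v):
    """Orthonormal Haar transform of a length-2^J vector; returns detail
    levels from finest to coarsest, with the single scaling coefficient last."""
    levels = []
    while len(v) > 1:
        levels.append((v[0::2] - v[1::2]) / np.sqrt(2.0))  # detail coefficients
        v = (v[0::2] + v[1::2]) / np.sqrt(2.0)             # running averages
    levels.append(v)
    return levels

def haar_inverse(levels):
    """Invert haar_forward."""
    v = levels[-1]
    for det in reversed(levels[:-1]):
        nxt = np.empty(2 * len(v))
        nxt[0::2] = (v + det) / np.sqrt(2.0)
        nxt[1::2] = (v - det) / np.sqrt(2.0)
        v = nxt
    return v

def soft(w, t):
    """Translate coefficients towards the origin by t, clipping at zero."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

rng = np.random.default_rng(1)
n = 1024
t_grid = np.arange(n) / n
signal = np.where(t_grid < 0.5, 1.0, -1.0) + np.sin(4 * np.pi * t_grid)
sigma = 0.4
noisy = signal + sigma * rng.standard_normal(n)

# Universal threshold: sigma * sqrt(2 log n) on the raw coefficient scale.
thresh = sigma * np.sqrt(2.0 * np.log(n))
levels = haar_forward(noisy)
denoised = haar_inverse([soft(d, thresh) for d in levels[:-1]] + [levels[-1]])

print("rmse noisy:   ", np.sqrt(np.mean((noisy - signal) ** 2)))
print("rmse denoised:", np.sqrt(np.mean((denoised - signal) ** 2)))
```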