A survey of cross-validation procedures for model selection

Sylvain Arlot and Alain Celisse. Statistics Surveys.
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided…
Cross-Validation, Risk Estimation, and Model Selection
Cross-validation is a popular non-parametric method for evaluating the accuracy of a predictive rule. The usefulness of cross-validation depends on the task we want to employ it for. In this note, I…
A Review of Cross Validation and Adaptive Model Selection
A review of model selection procedures, in particular various cross-validation procedures and adaptive model selection, is performed, and the connections between different procedures and information criteria are explored.
Cross-validation for selecting a model selection procedure
Results are provided on how to apply CV to consistently choose the best method, yielding new insights and guidance for a potentially vast range of applications.
On the usefulness of cross-validation for directional forecast evaluation
The results show that in such a situation with small samples the cross-validation scheme may have considerable advantages over the standard out-of-sample evaluation procedure as it may help to overcome problems induced by the limited information the directional accuracy measures contain due to their binary nature.
Model selection criteria based on cross-validatory concordance statistics
This work presents the development and investigation of three model selection criteria based on cross-validatory analogues of the traditional and adjusted c-statistics designed to estimate three corresponding measures of predictive error, and shows that these estimators serve as suitable model selection criteria.
Model selection for estimation of causal parameters
A popular technique for selecting and tuning machine learning estimators is cross-validation. Cross-validation evaluates overall model fit, usually in terms of predictive accuracy. This may lead to…
Best subset selection via cross-validation criterion
The purpose of this paper is to establish a mixed-integer optimization approach to selecting the best subset of explanatory variables via the cross-validation criterion, which can be formulated as a bilevel MIO problem and reduced to a single-level mixed-integer quadratic optimization problem.
An efficient variance estimator for cross-validation under partition sampling
  • Qing Wang, Xizhen Cai
  • Mathematics
  • Statistics
  • 2021
This paper concerns the problem of variance estimation of cross-validation. We consider the unbiased cross-validation risk estimate in the form of a general U-statistic and focus on estimating the…
Cross-Validation, Risk Estimation, and Model Selection: Comment on a Paper by Rosset and Tibshirani
How best to estimate the accuracy of a predictive rule has been a longstanding question in statistics. Approaches to this task range from simple methods like Mallows's Cp to algorithmic techniques…
The connection between cross-validation and Akaike information criterion in a semiparametric family
Both Akaike information criterion and cross-validation are important tools in model selection. Stone [(1977), ‘An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike's Criterion’,…


Linear Model Selection by Cross-validation
We consider the problem of model selection in the classical regression model based on cross-validation with an additional penalty term for penalizing overfitting. Under a given assumption, the new…
Unified Cross-Validation Methodology For Selection Among Estimators and a General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities and Examples
In Part I of this article we propose a general cross-validation criterion for selecting among a collection of estimators of a particular parameter of interest based on n i.i.d. observations. It is…
Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation
We construct a prediction rule on the basis of some data, and then wish to estimate the error rate of this rule in classifying future observations. Cross-validation provides a nearly…
Model Selection Via Multifold Cross Validation
In model selection, it is known that the simple leave-one-out cross-validation method is apt to select overfitted models. In an attempt to remedy this problem, we consider two notions of multi-fold…
Linear Model Selection by Cross-validation
We consider the problem of selecting a model having the best predictive ability among a class of linear models. The popular leave-one-out cross-validation method, which is asymptotically…
An alternative method of cross-validation for the smoothing of density estimates
Cross-validation with Kullback-Leibler loss function has been applied to the choice of a smoothing parameter in the kernel method of density estimation. A framework for this problem is constructed…
Robust Linear Model Selection by Cross-Validation
A robust algorithm for model selection in regression models using Shao's cross-validation methods for choice of variables as a starting point is provided, demonstrating a substantial improvement in choosing the correct model in the presence of outliers with little loss of efficiency at the normal model.
Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a…
On the use of cross-validation to assess performance in multivariate prediction
We describe a Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out…
A local cross-validation algorithm
The usual form of cross-validation is global in character, and is designed to estimate a density in some "average" sense over its entire support. In this paper we present a local version of…