Optimal cross-validation in density estimation with the $L^{2}$-loss

  • Alain Celisse
  • Published 5 November 2008
  • Mathematics
  • Annals of Statistics
We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is given to the so-called leave-$p$-out CV procedure (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon $V$-fold cross-validation in terms of variability and computational…
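The closed-form Lpo expressions remove the need to enumerate test sets at all. As a minimal illustration of the same least-squares CV idea, the sketch below applies the classical leave-one-out identity (the $p = 1$ special case for a histogram on $[0, 1]$) to pick a bin count; the function name and the Beta-distributed sample are illustrative choices, not from the paper, and this is the textbook identity rather than the paper's general Lpo formula:

```python
import numpy as np

def loo_lscv_score(x, n_bins):
    """Leave-one-out least-squares CV score for a histogram on [0, 1].

    Uses the classical closed-form identity (p = 1 special case):
        J(h) = 2 / ((n - 1) h)  -  (n + 1) / ((n - 1) h) * sum_j phat_j**2
    where phat_j is the fraction of points in bin j and h = 1 / n_bins.
    Smaller J estimates a smaller L2 risk (up to a data-free constant).
    """
    n = len(x)
    h = 1.0 / n_bins
    counts, _ = np.histogram(x, bins=n_bins, range=(0.0, 1.0))
    phat = counts / n
    return 2.0 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * np.sum(phat ** 2)

# Illustrative data supported on [0, 1]; select the bin count minimizing J.
rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=500)
best = min(range(2, 50), key=lambda m: loo_lscv_score(x, m))
```

No test set is ever held out explicitly: the counts alone determine the score, which is the computational point the closed-form Lpo expressions generalize to arbitrary $p$.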


Choice of V for V-Fold Cross-Validation in Least-Squares Density Estimation
A non-asymptotic oracle inequality is proved for V-fold cross-validation and its bias-corrected version (V-fold penalization), implying that V-fold penalization is asymptotically optimal in the nonparametric case.
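For context, the plain (uncorrected) V-fold risk estimate in least-squares density estimation can be sketched as follows, assuming a histogram estimator on $[0, 1]$; the function name is hypothetical and the paper's bias-corrected penalization is not implemented here:

```python
import numpy as np

def vfold_ls_risk(x, n_bins, V=5, seed=0):
    """V-fold CV estimate of the least-squares risk of a histogram on [0, 1].

    For each fold, the histogram is fit on the training part and the
    empirical contrast  ||fhat||_2^2 - 2 * mean(fhat(test))  is evaluated
    on the held-out part; the V fold values are averaged.
    """
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, V)
    h = 1.0 / n_bins
    scores = []
    for k in range(V):
        test = x[folds[k]]
        train = x[np.concatenate([folds[j] for j in range(V) if j != k])]
        counts, edges = np.histogram(train, bins=n_bins, range=(0.0, 1.0))
        fhat = counts / (len(train) * h)      # density height in each bin
        norm2 = np.sum(fhat ** 2) * h         # squared L2 norm of fhat
        bins = np.clip(np.digitize(test, edges) - 1, 0, n_bins - 1)
        scores.append(norm2 - 2.0 * fhat[bins].mean())
    return float(np.mean(scores))

# Illustrative use: estimated risk of a 10-bin histogram under 5-fold CV.
x = np.random.default_rng(1).beta(2.0, 5.0, size=300)
risk = vfold_ls_risk(x, n_bins=10)
```

The choice of V trades variability of this average against the bias from training on a fraction $(V-1)/V$ of the sample, which is exactly the trade-off the oracle inequality above quantifies.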
Theoretical Analysis of Cross-Validation for Estimating the Risk of the $k$-Nearest Neighbor Classifier
A general strategy is described for deriving moment and exponential concentration inequalities for the L$p$O estimator applied to the $k$-nearest neighbors ($k$NN) rule in the context of binary classification.
Local asymptotics of cross-validation in least-squares density estimation
In model selection, several types of cross-validation are commonly used and many variants have been introduced. While consistency of some of these methods has been proven, their rate of convergence…
Estimating the Kullback–Leibler risk based on multifold cross‐validation
This paper concerns a class of model selection criteria based on cross‐validation techniques and estimative predictive densities. Both the simple or leave‐one‐out and the multifold or leave‐m‐out…
Bias-aware model selection for machine learning of doubly robust functionals
An oracle property is established for a multi-fold cross-validation version of the new model selection criteria which states that the empirical criteria perform nearly as well as an oracle with a priori knowledge of the pseudo-risk for each candidate model.
Targeted Cross-Validation
This work proposes a targeted cross-validation (TCV) to select models or procedures based on a general weighted $L^2$ loss and shows that the TCV is consistent in selecting the best performing candidate under the weighted $L^2$ loss.
New upper bounds on cross-validation for the k-Nearest Neighbor classification rule
A new strategy is provided to derive bounds on moments of the leave-$p$-out estimator used to assess the performance of the $k$NN classifier, and these moment upper bounds are used to establish a new exponential concentration inequality for binary classification.
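Brute-force leave-$p$-out makes the motivation for such bounds concrete: the estimator averages over all $\binom{n}{p}$ splits, which is only feasible for tiny $n$ and $p$. The sketch below enumerates every split for a 1-NN rule on one-dimensional data (the function name and the 1-NN/1-D restriction are illustrative assumptions, not the $k$NN setting of the paper):

```python
from itertools import combinations
import numpy as np

def lpo_error_1nn(X, y, p):
    """Exact leave-p-out error of the 1-NN rule by brute-force enumeration.

    Averages the test misclassification rate over all C(n, p) splits --
    tractable only for tiny n and p, which is why closed forms and
    concentration bounds for the LpO estimator are valuable.
    """
    n = len(y)
    errs = []
    for test in combinations(range(n), p):
        train = [i for i in range(n) if i not in test]
        wrong = 0
        for i in test:
            d = [abs(X[i] - X[j]) for j in train]   # 1-D distances
            pred = y[train[int(np.argmin(d))]]      # label of nearest neighbor
            wrong += pred != y[i]
        errs.append(wrong / p)
    return float(np.mean(errs))

# Illustrative use on four well-separated points.
err = lpo_error_1nn([0.0, 0.4, 2.1, 2.5], [0, 0, 1, 1], 2)
```

The loop visits $\binom{n}{p}$ subsets, so the cost explodes combinatorially in $p$; the concentration inequalities above control the estimator without this enumeration.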
Learning high-dimensional probability distributions using tree tensor networks
We consider the problem of the estimation of a high-dimensional probability distribution using model classes of functions in tree-based tensor formats, a particular case of tensor networks associated…
Evolutionary cross validation
An evolutionary cross-validation algorithm is proposed for identifying optimal folds in a dataset to improve predictive modeling accuracy; experimental results suggest that the proposed algorithm provides a significant improvement over the baseline 10-fold cross-validation.
Asymptotic Properties of a Class of Criteria for Best Model Selection
  • V. Stepashko
  • Mathematics
    2020 IEEE 15th International Conference on Computer Sciences and Information Technologies (CSIT)
  • 2020
The paper investigates the asymptotic convergence of some typical criteria for model selection from a given data sample. A range of known criteria are generalized into a special class joining two…


Model selection via cross-validation in density estimation, regression, and change-points detection
A fully resampling-based procedure is proposed, which enables to deal with the hard problem of heteroscedasticity, while keeping a reasonable computational complexity.
Nonparametric density estimation by exact leave-p-out cross-validation
Asymptotics of cross-validated risk estimation in estimator selection and performance assessment
Linear Model Selection by Cross-validation
We consider the problem of selecting a model having the best predictive ability among a class of linear models. The popular leave-one-out cross-validation method, which is asymptotically…
It is shown that under some conditions, with an appropriate choice of the data-splitting ratio, cross-validation is consistent in the sense of selecting the better procedure with probability approaching 1.
Risk bounds for model selection via penalization
It is shown that the quadratic risk of the minimum penalized empirical contrast estimator is bounded by an index of the accuracy of the sieve, which quantifies the trade-off among the candidate models between the approximation error and parameter dimension relative to sample size.
A leave-p-out based estimation of the proportion of null hypotheses
In the multiple testing context, a challenging problem is the estimation of the proportion $\pi_0$ of true-null hypotheses. A large number of estimators of this quantity rely on identifiability…
Adaptive Model Selection Using Empirical Complexities
The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand, when each model class has an infinite VC or pseudo dimension.
Model Selection and Error Estimation
A tight relationship between error estimation and data-based complexity penalization is pointed out: any good error estimate may be converted into a data- based penalty function and the performance of the estimate is governed by the quality of the error estimate.
An alternative method of cross-validation for the smoothing of density estimates
An alternative method of cross-validation, based on integrated squared error, recently also proposed by Rudemo (1982), is derived, and Hall (1983) has established the consistency and asymptotic optimality of the new method.