Abstract

When using nonparametric estimates of the mean curve, surface or image underlying noisy observations, the selection of smoothing parameters is generally crucial. This paper gives a theoretical comparison of the performance of generalized cross-validation (GCV) and of its fast randomized version (RGCV) as selection criteria. This is done mainly by studying, for each selector, the asymptotic distribution of the excess error, that is, the difference between the (data-driven) resulting average squared error (ASE) and the best possible ASE. We show that, by using randomization, this distribution is dilated, compared with that for CV or GCV, only by a factor always lower than $1 + 1/n_R$, where $n_R$ is the number of primary randomized trace estimates used in RGCV. Among the compared selectors we also include partial cross-validation (PCV), in which only a fraction of all possible leave-one-out validation tests is evaluated; PCV is a common practice for reducing the computational cost in many contexts. From this computational point of view, however, PCV turns out to be quite inefficient compared with RGCV. Moreover, we show that a precise comparison (and an interpretation of the gain from using $n_R \geq 2$) is possible in terms of excess errors that are equivalent in distribution, provided PCV uses a certain fraction of the test points greater than 50%. The resulting comparisons are quite reassuring about what is sacrificed in using randomized selectors. We give rigorous results mainly for the kernel regression setting, as in the previous detailed study of standard selectors by Härdle, Hall and Marron, except that we do not restrict this setting to an equidistant design.
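To make the three selectors concrete, here is a minimal numerical sketch, not the implementation studied in the paper. It contrasts exact GCV, RGCV with $n_R$ randomized trace estimates, and PCV on a Nadaraya-Watson kernel smoother; the Gaussian kernel, the simulated design, and all function names (smoother_matrix, gcv_score, rgcv_score, pcv_score) are illustrative assumptions. The excess error of a selector $\hat{h}$ is $\mathrm{ASE}(\hat{h}) - \min_h \mathrm{ASE}(h)$, so each criterion matters only through the $h$ it selects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy observations of a smooth curve on a non-equidistant design.
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

def smoother_matrix(h):
    """Nadaraya-Watson smoother matrix A(h) with a Gaussian kernel (illustrative choice)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d * d)
    return w / w.sum(axis=1, keepdims=True)

def gcv_score(h):
    """Exact GCV: n^{-1} ||(I - A)y||^2 / (1 - tr(A)/n)^2, using the exact trace."""
    A = smoother_matrix(h)
    r = y - A @ y
    return (r @ r / n) / (1.0 - np.trace(A) / n) ** 2

# Fixed +/-1 probe vectors, reused for every h, so that the randomized
# criterion is a single well-defined function over all smoothing parameters.
n_R = 2
W = rng.choice([-1.0, 1.0], size=(n, n_R))

def rgcv_score(h):
    """RGCV: tr(A) is replaced by the average of n_R randomized (Hutchinson-type)
    estimates w^T A w. Only products A @ w are needed, which is the source of speed."""
    A = smoother_matrix(h)  # stand-in; a fast solver would apply A to w directly
    tr_est = np.einsum('ij,ij->j', W, A @ W).mean()
    r = y - A @ y
    return (r @ r / n) / (1.0 - tr_est / n) ** 2

# PCV: average only a random fraction p of the n leave-one-out squared errors.
p = 0.6
subset = rng.choice(n, size=int(p * n), replace=False)

def pcv_score(h):
    """For this row-normalized linear smoother, the leave-one-out residual is
    (y_i - (Ay)_i) / (1 - A_ii); PCV averages its square over the subset only."""
    A = smoother_matrix(h)
    loo = (y - A @ y) / (1.0 - np.diag(A))
    return np.mean(loo[subset] ** 2)

hs = np.linspace(0.01, 0.2, 60)
for name, crit in [("GCV", gcv_score), ("RGCV", rgcv_score), ("PCV", pcv_score)]:
    h_hat = hs[np.argmin([crit(h) for h in hs])]
    print(f"{name:4s} selects h = {h_hat:.3f}")
```

Keeping the probe vectors fixed across $h$ makes RGCV, for a given draw, a deterministic criterion just like GCV and PCV, which is what allows the excess errors of the three selectors to be compared in distribution.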
