Abstract
This paper concerns variance estimation for cross-validation. We consider the unbiased cross-validation risk estimate in the form of a general U-statistic and focus on estimating the variance of this U-statistic risk score. We propose an efficient variance estimator under a half-sampling design, for which the bias of the estimator can be expressed explicitly. Furthermore, we discuss a practical approach to estimating this bias by a two-layer Monte Carlo method, which yields a bias-corrected variance estimator. In a simulation study and real data examples, we evaluate the performance of the proposed variance estimator against the commonly used bootstrap and jackknife methods in the context of model selection under the one-standard-error rule. The numerical results suggest that the proposal yields identical or similar model selection conclusions compared with its counterparts, while being much more efficient to compute. Finally, we discuss the generalization of the methodology to other partition-sampling scenarios.
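To make the role of the variance estimate concrete, the following is a minimal sketch of the one-standard-error rule as it is commonly stated: among candidate models ordered from simplest to most complex, select the simplest one whose estimated risk lies within one standard error of the smallest estimated risk. The function name `one_standard_error_rule`, the ordering convention, and the numerical values are assumptions made for illustration only; they are not taken from the paper. The standard errors would in practice be the square roots of whichever variance estimates are being compared.

```python
def one_standard_error_rule(risks, std_errors):
    """Return the index of the simplest model within one SE of the best risk.

    Assumption: models are ordered from simplest (index 0) to most complex.
    """
    if not risks or len(risks) != len(std_errors):
        raise ValueError("risks and std_errors must be non-empty and of equal length")
    # Index of the model with the smallest estimated risk.
    best = min(range(len(risks)), key=lambda i: risks[i])
    # Largest risk still considered statistically comparable to the best.
    threshold = risks[best] + std_errors[best]
    # Scan from simplest to most complex and return the first model under the threshold.
    for i, risk in enumerate(risks):
        if risk <= threshold:
            return i
    return best


# Hypothetical cross-validation risk estimates for four models, simplest first.
risks = [0.40, 0.31, 0.30, 0.33]
std_errors = [0.03, 0.02, 0.02, 0.04]
selected = one_standard_error_rule(risks, std_errors)
```

In this hypothetical example the third model has the smallest risk, but the second model is selected because its risk falls within one standard error of that minimum. The selection depends on the variance estimate only through the threshold, which is why different variance estimators can lead to the same selected model.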