Abstract

Model selection in kernel methods is the problem of choosing an appropriate hypothesis space for a kernel-based learning algorithm so that the resulting hypothesis neither underfits nor overfits. One of the main problems in model selection is how to control the sample complexity when designing the selection criterion. In this paper, we take balls of reproducing kernel Hilbert spaces (RKHSs) as candidate hypothesis spaces and propose a novel model selection criterion based on minimizing the empirical optimal error in the ball of the RKHS together with the covering number of the ball. Because the covering number measures the capacity of the ball, our criterion directly controls the sample complexity. Specifically, we first prove a relation between the expected optimal error and the empirical optimal error in the ball of the RKHS, and use this relation as the theoretical foundation for defining our criterion. Then, by estimating the expectation of the empirical optimal error and proving an upper bound on the covering number, we express our criterion as a functional of the kernel matrix. We further develop an efficient algorithm that computes this functional approximately, in a form to which the fast Fourier transform (FFT) can be applied, achieving quasi-linear computational complexity. We also prove that the approximate criterion is consistent with the accurate one for sufficiently large samples. Finally, we empirically evaluate the performance of our criterion and verify the consistency between the approximate and accurate criteria.
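The abstract does not say how the FFT enters the approximate computation. As a purely illustrative sketch, and not the paper's algorithm, the code below assumes the kernel matrix has been replaced by a symmetric circulant matrix, a standard structure for which the FFT gives all eigenvalues and matrix-vector products in O(n log n) time. The function names, the ring-shaped Gaussian column, and the power-of-two length restriction are assumptions made for this example.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ring_gaussian_column(n, sigma):
    """Hypothetical first column of a symmetric circulant 'kernel' matrix:
    a Gaussian of the wrap-around distance between index 0 and index k."""
    return [math.exp(-min(k, n - k) ** 2 / (2.0 * sigma ** 2)) for k in range(n)]


def circulant_eigenvalues(first_column):
    """Eigenvalues of a circulant matrix are the DFT of its first column."""
    return fft(first_column)


def circulant_matvec(first_column, x):
    """Compute C @ x in O(n log n): a circular convolution done in Fourier space."""
    n = len(first_column)
    spectrum = [a * b for a, b in zip(fft(first_column), fft(x))]
    return [v.real / n for v in fft(spectrum, inverse=True)]


def dense_circulant_matvec(first_column, x):
    """Reference O(n^2) product with C[i][j] = first_column[(i - j) mod n]."""
    n = len(first_column)
    return [sum(first_column[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]


if __name__ == "__main__":
    c = ring_gaussian_column(8, sigma=1.5)
    x = [float(i) for i in range(8)]
    fast = circulant_matvec(c, x)
    slow = dense_circulant_matvec(c, x)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

Any functional of the kernel matrix that depends only on its spectrum or on matrix-vector products could be evaluated this way once a circulant surrogate is available. Whether the paper uses this particular structure is not stated in the abstract.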
