Abstract

This paper studies how data representation influences the generalization error of kernel-based learning machines such as support vector machines (SVMs) for classification. Frame theory provides a well-founded mathematical framework for representing data in many different ways. We analyze the effects of sparse and dense data representations on the generalization error of such learning machines, measured by the leave-one-out error given a finite amount of training data. We show that, for sparse data representations, the generalization error of an SVM trained with polynomial or Gaussian kernel functions is equal to that of a linear SVM. Equivalently, the capacity to separate points of the functions in the hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to that of a separating hyperplane in the input space. Moreover, we show that, in general, sparse data representations either increase the generalization error of kernel-based methods or leave it unchanged. Dense data representations, by contrast, reduce the generalization error in the case of very large frames. We use two different schemes for representing data in overcomplete systems of Haar and Gabor functions, and measure the SVM generalization error on benchmark data sets.

