Abstract

In this article, we study rates of convergence of the generalization error of multi-class margin classifiers. In particular, we develop an upper bound theory quantifying the generalization error of various large margin classifiers. The theory permits a treatment of general margin losses, convex or nonconvex, in the presence or absence of a dominating class. Three main results are established. First, for any fixed margin loss, there may be a trade-off between the ideal and actual generalization performances with respect to the choice of the class of candidate decision functions, which is governed by the trade-off between the approximation and estimation errors. In fact, different margin losses lead to different ideal or actual performances in specific cases. Second, we demonstrate, in a problem of linear learning, that the convergence rate can be arbitrarily fast in the sample size n, depending on the joint distribution of the input/output pair. This goes beyond the anticipated rate O(n^{-1}). Third, we establish rates of convergence of several margin classifiers in feature selection, where the number of candidate variables p may greatly exceed the sample size n, provided p grows no faster than exp(n).
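The trade-off between ideal and actual generalization performance described above is governed by the standard decomposition of the excess risk into approximation and estimation parts. The display below is only a sketch of that decomposition; the notation (the learned rule \hat f, the Bayes rule f^*, and the class \mathcal{F} of candidate decision functions) is assumed here for illustration rather than taken from the paper.

```latex
% Sketch of the excess-risk decomposition behind the trade-off above.
% Notation is assumed for illustration: \hat f is the learned decision
% function, f^* the Bayes rule, \mathcal{F} the candidate class.
\[
\underbrace{\operatorname{Regret}(\hat f, f^*)}_{\text{actual performance}}
  \;=\;
\underbrace{\Bigl(\operatorname{Regret}(\hat f, f^*)
    - \inf_{f \in \mathcal{F}} \operatorname{Regret}(f, f^*)\Bigr)}_{\text{estimation error}}
  \;+\;
\underbrace{\inf_{f \in \mathcal{F}} \operatorname{Regret}(f, f^*)}_{\text{approximation error}}
\]
```

Enlarging \mathcal{F} shrinks the approximation error but typically inflates the estimation error, which is the trade-off quantified by the first result.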

Highlights

  • Large margin classification has seen significant developments in the past several years, including many well-known classifiers such as Support Vector Machine (SVM, (7)) and Neural Networks

  • The generalization accuracy of large margin classifiers has been investigated in two-class classification

  • Much less is known about the generalization accuracy of large margin classifiers in the multi-class case, in particular its relation to the presence or absence of a dominating class, an issue that does not arise in the two-class case


Summary

Introduction

Large margin classification has seen significant developments in the past several years, including many well-known classifiers such as Support Vector Machine (SVM, (7)) and Neural Networks. One major difficulty with this formulation is that the approximation error may dominate the corresponding estimation error and remain non-zero. This occurs in classification with linear decision functions; see Section 5.1 for an example. In such a situation, well-established bounds for the estimation error become irrelevant, and such a learning theory breaks down when the approximation error does not tend to zero. To treat multi-class margin classification and circumvent the aforementioned difficulty, we take a novel approach by targeting Regret(f, f_V), with f_V the risk minimizer over F given V. Toward this end, we study the ideal generalization performance of f_V and the mean-variance relationship of the cost function.
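As a concrete point of reference for the margin loss V, the sketch below evaluates one familiar example, a Crammer-Singer-style multi-class hinge loss. This is purely illustrative and is not the paper's general formulation, which covers arbitrary convex or nonconvex margin losses; the function name and the specific loss are assumptions made here for exposition.

```python
import numpy as np

def multiclass_hinge_risk(scores, labels):
    """Empirical risk under a Crammer-Singer-style multi-class hinge loss.

    scores: (n, k) array of decision-function values f_j(x_i)
    labels: (n,) integer array of class indices in {0, ..., k-1}

    One concrete instance of a multi-class margin loss; the paper's theory
    applies to general margin losses, of which this is only an example.
    """
    n = scores.shape[0]
    true_scores = scores[np.arange(n), labels]        # f_{y_i}(x_i)
    masked = scores.copy()
    masked[np.arange(n), labels] = -np.inf            # exclude the true class
    best_other = masked.max(axis=1)                   # strongest competitor
    margins = true_scores - best_other                # generalized margin
    return np.maximum(0.0, 1.0 - margins).mean()      # hinge on the margin

# Toy usage: 5 samples, 3 classes, random scores.
rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 3))
labels = rng.integers(0, 3, size=5)
print(multiclass_hinge_risk(scores, labels))
```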

Multi-class and generalized margin losses
Ideal generalization performance
Actual generalization performance
Linear classification
Multi-class linear classification
Nonlinear classification
Feature selection
Conclusion
