Abstract
The Support Vector Machine (SVM) is an effective classifier, but a major shortcoming is its heavy computational cost on large-scale learning tasks. Sample selection is a feasible strategy for overcoming this problem. To reduce the number of training samples without sacrificing recognition accuracy, this paper presents a novel sample selection approach named Kernel Subclass Convex Hull (KSCH), which tries to select the boundary samples of each class convex hull. The idea is derived from the geometric interpretation of SVM: training an SVM can be converted into the problem of computing the nearest points between two convex hulls, so the convex hull of each class effectively determines the separating plane of the SVM. Since the convex hull of a set is fully determined by its boundary samples, training an SVM on the boundary samples of each class is equivalent to training it on all training samples. Based on this idea, the KSCH method iteratively selects boundary samples of each class convex hull in the high-dimensional space induced by the kernel trick. The convex hull of the chosen set is called the subclass convex hull. As the chosen set grows, each subclass convex hull rapidly approximates the corresponding class convex hull, so the samples selected by our method can efficiently represent the original training set and support SVM classification. Experimental results on the MIT-CBCL face database and the UMIST face database show that the KSCH sample selection method selects a small number of high-quality samples while maintaining the recognition accuracy of SVM.
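The iterative selection described above can be illustrated with a minimal sketch. It is one plausible reading of the idea, not the paper's exact algorithm. It assumes a greedy rule that repeatedly adds the sample farthest, in kernel feature space, from the convex hull of the samples chosen so far. It also assumes the seed is the sample farthest from the class mean, and that the hull distance is approximated with a Frank-Wolfe solver. The names `hull_distance_sq` and `select_boundary_samples`, the seed rule, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def hull_distance_sq(K_SS, k_Sx, k_xx, iters=200):
    """Approximate squared feature-space distance from phi(x) to the convex
    hull of {phi(s) : s in S}, given only kernel values.

    Minimises a^T K_SS a - 2 a^T k_Sx + k_xx over the probability simplex
    with Frank-Wolfe and exact line search (an assumed solver choice).
    """
    m = K_SS.shape[0]
    alpha = np.full(m, 1.0 / m)          # start at the hull's centroid
    for _ in range(iters):
        grad = 2.0 * (K_SS @ alpha - k_Sx)
        d = -alpha.copy()
        d[np.argmin(grad)] += 1.0        # direction towards the best vertex
        slope = grad @ d
        if slope > -1e-12:               # no descent direction left
            break
        curv = d @ K_SS @ d
        step = 1.0 if curv <= 0 else min(1.0, -slope / (2.0 * curv))
        alpha += step * d
    dist = alpha @ K_SS @ alpha - 2.0 * (alpha @ k_Sx) + k_xx
    return max(float(dist), 0.0)


def select_boundary_samples(K, n_select):
    """Greedily pick indices of one class from its Gram matrix K (n x n)."""
    n = K.shape[0]
    # Assumed seed: the sample farthest from the class mean in feature space.
    dist_to_mean = np.diag(K) - 2.0 * K.mean(axis=1) + K.mean()
    selected = [int(np.argmax(dist_to_mean))]
    while len(selected) < min(n_select, n):
        S = np.array(selected)
        K_SS = K[np.ix_(S, S)]
        best, best_d = -1, -1.0
        for i in range(n):
            if i in selected:
                continue
            d = hull_distance_sq(K_SS, K[S, i], K[i, i])
            if d > best_d:               # keep the sample farthest from the hull
                best, best_d = i, d
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy data: four corners of a square plus three interior points.
    X = np.array([[0, 0], [2, 0], [0, 2], [2, 2],
                  [1, 1], [0.5, 1], [1.2, 0.8]], dtype=float)
    K = X @ X.T                          # linear kernel, for easy inspection
    print(sorted(select_boundary_samples(K, 4)))
```

The function takes a precomputed Gram matrix, so any kernel can be substituted for the linear kernel used in the toy example. In an SVM pipeline it would be run once per class, and the union of the selected indices would form the reduced training set.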