Abstract

Support vector machines (SVMs) are powerful statistical learning tools, but their application to large datasets can cause time-consuming training complexity. To address this issue, various instance selection (IS) approaches have been proposed, which choose a small fraction of critical instances and screen out others before training. However, existing methods have not been able to balance accuracy and efficiency well. Some methods miss critical instances, while others use complicated selection schemes that require even more execution time than training with all original instances, thus violating the initial intention of IS. In this work, we present a newly developed IS method called Valid Border Recognition (VBR). VBR selects the closest heterogeneous neighbors as valid border instances and incorporates this process into the creation of a reduced Gaussian kernel matrix, thus minimizing the execution time. To improve reliability, we propose a strengthened version of VBR (SVBR). Based on VBR, SVBR gradually adds farther heterogeneous neighbors as complements until the Lagrange multipliers of already selected instances become stable. In numerical experiments, the effectiveness of our proposed methods is verified on benchmark and synthetic datasets in terms of accuracy, execution time and inference time.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.