Abstract

Support vector machines (SVMs) are hyperplane classifiers defined in a kernel-induced feature space. The high computational and space requirements of solving the conventional SVM problem prohibit its use in applications involving large datasets. The core vector machine (CVM) is a suitable technique for scaling an SVM to large-scale pattern classification problems. However, when the datasets are unbalanced, the performance of the CVM is observed to be poor in terms of both generalisation and training time. In such scenarios, CVM performance depends strongly on the ordering of data points belonging to the two classes within the dataset. In this paper, we propose two training schemes that improve the performance of the CVM irrespective of the ordering of patterns belonging to different classes within the dataset. These methods employ selective-sampling-based training of the CVM using novel kernel-based clustering algorithms. Empirical studies on several synthetic and real-world datasets show that the proposed strategies improve the performance of the CVM on large datasets.
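The CVM's scalability comes from recasting SVM training as a minimum enclosing ball (MEB) problem and solving it approximately on a small core set. As a rough illustration of that underlying idea (a sketch of the standard Bădoiu–Clarkson (1+ε)-approximation in input space, not the authors' kernel-space implementation), the centre update looks like:

```python
import numpy as np

def meb_approx(X, eps=0.1):
    """(1+eps)-approximate minimum enclosing ball of the rows of X,
    via the Badoiu-Clarkson iteration that CVM-style training builds on.
    Returns (centre, radius)."""
    c = X[0].astype(float).copy()          # start at an arbitrary point
    iters = int(np.ceil(1.0 / eps ** 2))   # O(1/eps^2) iterations suffice
    for t in range(1, iters + 1):
        d = np.linalg.norm(X - c, axis=1)
        far = int(np.argmax(d))            # farthest point = core-set candidate
        c += (X[far] - c) / (t + 1)        # shrinking step toward it
    r = np.linalg.norm(X - c, axis=1).max()
    return c, r
```

In the actual CVM the distances are computed with the kernel in feature space, and only the core-set points (a small fraction of the data) enter the quadratic programme, which is what makes training tractable on large datasets.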
