A Hybrid Algorithm to Improve the Accuracy of Support Vector Machines on Skewed Data-Sets

Jair Cervantes, Farid García-Lamont, De-Shuang Huang, Asdrúbal López-Chau

doi:10.1007/978-3-319-09333-8_85

Open Access

Abstract

Over the past few years, it has been shown that the generalization power of Support Vector Machines (SVM) falls dramatically on imbalanced data-sets. In this paper, we propose a new method to improve the accuracy of SVM on imbalanced data-sets. First, we use undersampling and an SVM to obtain the initial support vectors (SVs) and a sketch of the hyperplane. These support vectors are used to generate new artificial instances, which form the initial population of a genetic algorithm. The genetic algorithm improves the population of artificial instances from one generation to the next and eliminates instances that introduce noise in the hyperplane. Finally, the generated and evolved data are added to the original data-set to reduce the imbalance and improve the generalization ability of the SVM on skewed data-sets.

Keywords: Support Vector Machines, Hybrid, Imbalanced
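The pipeline described in the abstract (undersample, train an SVM, seed a genetic algorithm from the support vectors, add the evolved instances to the data) can be illustrated with a minimal, standard-library-only sketch. The abstract does not state the kernel, the rule for generating artificial instances, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: a linear SVM trained by stochastic sub-gradient descent, Gaussian jitter around approximate minority support vectors, G-mean as fitness, and elitist selection with uniform crossover. All function names and parameter values are hypothetical, and the minority class is assumed to be labelled +1.

```python
import math
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.05, epochs=30, seed=0):
    """Linear SVM via stochastic sub-gradient descent on the hinge loss; y in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            w = [wj - eta * lam * wj for wj in w]          # regularisation shrink
            if violated:                                   # hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def g_mean(model, X, y):
    """Geometric mean of the recall on each class (assumes both classes are present)."""
    w, b = model
    hits = {1: 0, -1: 0}
    for xi, yi in zip(X, y):
        pred = 1 if dot(w, xi) + b >= 0 else -1
        hits[yi] += pred == yi
    return math.sqrt(hits[1] / y.count(1) * hits[-1] / y.count(-1))

def undersample(X, y, rng):
    """Keep all minority (+1) points and an equally sized random majority subset."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = rng.sample([x for x, t in zip(X, y) if t == -1], len(pos))
    return pos + neg, [1] * len(pos) + [-1] * len(neg)

def minority_support_vectors(model, X, y):
    """Approximate SVs: minority points on or inside the margin (assumption)."""
    w, b = model
    svs = [x for x, t in zip(X, y) if t == 1 and dot(w, x) + b <= 1]
    return svs or [x for x, t in zip(X, y) if t == 1]

def jitter(point, rng, scale=0.3):
    return [v + rng.gauss(0, scale) for v in point]

def fitness(individual, X, y):
    """G-mean on the original data of an SVM trained on original + synthetic points."""
    model = train_linear_svm(X + individual, y + [1] * len(individual))
    return g_mean(model, X, y)

def evolve(X, y, pop_size=6, generations=4, seed=0):
    rng = random.Random(seed)
    Xb, yb = undersample(X, y, rng)
    svs = minority_support_vectors(train_linear_svm(Xb, yb), Xb, yb)
    k = y.count(-1) - y.count(1)        # synthetic points needed to balance the classes
    # Each individual is one candidate set of k artificial minority instances.
    pop = [[jitter(rng.choice(svs), rng) for _ in range(k)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, X, y), reverse=True)
        history.append(fitness(pop[0], X, y))
        elite = pop[: pop_size // 2]    # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]    # uniform crossover
            children.append([jitter(p, rng, 0.1) if rng.random() < 0.2 else p
                             for p in child])                     # mutation
        pop = elite + children
    best = max(pop, key=lambda ind: fitness(ind, X, y))
    history.append(fitness(best, X, y))
    return X + best, y + [1] * k, history
```

Calling `evolve(X, y)` on a list of feature vectors `X` and labels `y` returns the augmented, class-balanced data-set together with the best fitness per generation. Because the best individuals are carried over unchanged and the fitness is deterministic, that history never decreases; whether it actually improves depends on the data and on the assumed operators.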
