Abstract

This paper addresses the classification of imbalanced datasets using a support vector machine (SVM) combined with oversampling. Traditional oversampling methods tend to increase the overlap between classes, which degrades the generalization of the SVM classifier. To solve this problem, this paper proposes a method that combines a quasi-linear SVM with an assembled SMOTE. The quasi-linear SVM is an SVM with a quasi-linear kernel function; it approximates a nonlinear separation boundary by interpolating multiple local linear boundaries. The assembled SMOTE performs oversampling that takes the data distribution into account and thereby avoids introducing overlap between classes. First, a partition method based on the minimal spanning tree is proposed to obtain local linear partitions, each of which can be separated by a single linear boundary. Second, using the information from these local linear partitions, the assembled SMOTE generates synthetic minority-class samples. Finally, the quasi-linear SVM classifies the oversampled dataset in the same way as a standard SVM, using a composite quasi-linear kernel function. Experimental results on artificial data and benchmark datasets show that the proposed method is effective and improves classification performance.
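
To make the oversampling step concrete, the following is a minimal sketch of SMOTE-style interpolation restricted to local partitions, so that a synthetic sample is never drawn on a segment joining two different partitions. It is an illustration under stated assumptions, not the paper's implementation: the function name `partitioned_smote`, its parameters, and the premise that partition labels are already available are all hypothetical, and the minimal-spanning-tree partitioning itself is not reproduced here.

```python
import numpy as np

def partitioned_smote(X_min, part_ids, n_new, k=3, seed=0):
    """Hypothetical helper: create n_new synthetic minority samples by
    interpolating only between minority points in the same partition."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    part_ids = np.asarray(part_ids)

    # Only partitions holding at least two minority points can be interpolated.
    labels, counts = np.unique(part_ids, return_counts=True)
    usable = labels[counts >= 2]
    if usable.size == 0:
        raise ValueError("no partition holds two or more minority samples")

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        # Assumption: partitions are picked uniformly at random.
        p = rng.choice(usable)
        members = X_min[part_ids == p]
        a = rng.integers(len(members))

        # Nearest neighbours of the seed point, searched inside its partition only.
        d = np.linalg.norm(members - members[a], axis=1)
        d[a] = np.inf  # exclude the seed point itself
        neighbours = np.argsort(d)[:min(k, len(members) - 1)]
        b = rng.choice(neighbours)

        # Standard SMOTE step: a random point on the segment between a and b.
        gap = rng.random()
        synthetic[i] = members[a] + gap * (members[b] - members[a])
    return synthetic

# Usage: two well-separated partitions of minority samples.
X = np.array([[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]], dtype=float)
ids = np.array([0, 0, 0, 1, 1, 1])
S = partitioned_smote(X, ids, n_new=20, k=2, seed=1)
```

Because each synthetic point is a convex combination of two points from one partition, it stays inside that partition's convex hull, which is the intuition behind avoiding overlap between classes. How partitions are chosen and weighted in the actual method is left open here.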
