Abstract

The One-versus-All (OVA) approach is one of the mainstream decomposition methods, in which multiple binary classifiers are used to solve multiclass classification tasks. However, it suffers from serious class imbalance. This paper proposes a differential partition sampling ensemble method (DPSE) within the OVA framework. The number of majority samples and the number of minority samples in each binary training dataset are used as the upper and lower limits of the sampling interval, respectively. Within this range, a set of multiple different sampling numbers with equal intervals is generated in the manner of an arithmetic sequence. All samples are divided into safe examples, borderline examples, rare examples, and outliers according to their neighborhood information; based on the distribution characteristics of the classes, random undersampling for safe examples (s-Random undersampling) and SMOTE for borderline and rare examples (br-SMOTE) are then proposed. In each iteration, according to the differential sampling number, the two methods are used to undersample or oversample the majority and minority classes in each binary training dataset so as to balance the number of positive and negative samples while preserving the class structure as much as possible. The balanced training sets are used to train a binary classification model with multiple sub-classifiers. Thorough experiments on 27 public multiclass datasets from KEEL show that DPSE outperforms typical methods in the OVA scheme, the One-versus-One scheme, and the direct approach in classification performance.
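
The construction of the sampling numbers described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sampling_numbers`, the parameter `n_steps` (the number of sampling numbers, and hence of sub-classifiers), the inclusion of both limits in the set, and the rounding to whole sample counts are all assumptions made for illustration.

```python
def sampling_numbers(n_minority: int, n_majority: int, n_steps: int) -> list[int]:
    """Return n_steps equally spaced sampling numbers between the class sizes.

    Hypothetical helper: the lower limit is the minority count and the upper
    limit is the majority count; both limits are assumed to be included.
    """
    if n_minority > n_majority:
        raise ValueError("n_minority must not exceed n_majority")
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    # Common difference of the arithmetic sequence (may be fractional).
    step = (n_majority - n_minority) / (n_steps - 1)
    # Round each term so that it is a usable sample count (assumption).
    return [round(n_minority + i * step) for i in range(n_steps)]


# Example: a binary OVA training set with 20 minority and 100 majority samples.
numbers = sampling_numbers(n_minority=20, n_majority=100, n_steps=5)
print(numbers)  # [20, 40, 60, 80, 100]
```

Under these assumptions, each sampling number would serve as the common target size for both classes in one iteration: the majority class is reduced to it and the minority class is increased to it, so each sub-classifier sees a differently sized but balanced training set.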
