Abstract

SMOTE-based oversampling methods, which generate synthetic minority class samples to enlarge minority classes, have been favored for class-imbalanced classification. However, they often generate noisy samples when the classes in the training data are not well separated. Filtering-based oversampling methods are recognized as effective solutions to this problem: they employ noise filters, built on instance selection methods, to remove suspicious noise. These filters nevertheless suffer from the following issues: a) they rely heavily on strong assumptions, which lowers their robustness across different datasets; b) they are designed for a specific oversampling method and are not easily extended to others; and c) they have a relatively high time cost. To address noise generation while overcoming issues a)-c), an oversampling framework based on sample subspace optimization with accelerated binary particle swarm optimization (OF-SSO-ABPSO) is proposed. OF-SSO-ABPSO is a wrapping framework compatible with almost all oversampling methods. First, a SMOTE-based method is used within the framework to generate synthetic minority class samples. Second, a novel accelerated binary particle swarm optimization (ABPSO) algorithm is proposed, featuring a new search space reduction strategy, a new particle update mechanism, and a new fitness function. Third, a novel ABPSO-based sample subspace optimization (SSO-ABPSO) method is proposed and used as a noise filter to remove suspicious noise from the training set and the synthetic minority class samples. Experiments show that a) OF-SSO-ABPSO improves 6 representative SMOTE variants by addressing noise generation, and b) OF-SSO-ABPSO outperforms 7 state-of-the-art filtering-based oversampling methods.
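
The three-step structure described above (oversample, then search over a binary keep/drop mask, then filter) can be illustrated with a minimal Python sketch. This is not the authors' ABPSO: the abstract does not specify the search space reduction strategy, the particle update mechanism, or the fitness function, so the sketch substitutes a plain binary PSO with a sigmoid transfer function and a leave-one-out 1-NN accuracy as a stand-in fitness. All function names, parameter values, and the toy data are assumptions made for illustration only.

```python
import math
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Basic SMOTE idea: interpolate between a minority sample and one of
    its k nearest minority neighbours. Needs at least two minority samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def fitness(mask, samples, labels, n_original):
    """Stand-in fitness (assumption): leave-one-out 1-NN accuracy on the
    original samples, using only the samples kept by `mask` as references."""
    kept = [i for i, bit in enumerate(mask) if bit]
    correct = 0
    for i in range(n_original):
        refs = [j for j in kept if j != i]
        if refs:
            nearest = min(refs, key=lambda j: math.dist(samples[i], samples[j]))
            correct += labels[nearest] == labels[i]
    return correct / n_original

def binary_pso_filter(samples, labels, n_original,
                      n_particles=10, n_iter=20, seed=0):
    """Plain binary PSO over keep/drop masks (not the paper's ABPSO)."""
    rng = random.Random(seed)
    dim = len(samples)
    # One particle keeps every sample, so the result is never worse than
    # applying no filter at all under this fitness.
    pos = [[1] * dim] + [[rng.randint(0, 1) for _ in range(dim)]
                         for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, samples, labels, n_original) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                v = (0.7 * vel[i][d]
                     + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                     + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp the velocity
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i], samples, labels, n_original)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy data: the last minority point sits inside the majority cluster.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (0.3, 0.2), (0.0, 0.4), (0.4, 0.0)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (0.25, 0.15)]

# Step 1: oversample the minority class.
synthetic = smote_like(minority, n_new=6, rng=random.Random(1))

# Steps 2 and 3: search for a keep/drop mask over originals plus synthetics.
samples = majority + minority + synthetic
labels = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
n_original = len(majority) + len(minority)
mask, best_fit = binary_pso_filter(samples, labels, n_original)

filtered = [(s, y) for s, y, bit in zip(samples, labels, mask) if bit]
print(len(samples), "samples before filtering,", len(filtered), "after")
print("stand-in fitness of the selected subset:", best_fit)
```

Because the filter only sees a list of samples and labels, the oversampling step can be swapped for another method without touching the search, which is the sense in which such a framework wraps an oversampler. The stand-in fitness here is deliberately simple; plain accuracy is a weak objective under class imbalance, and a real implementation would likely use an imbalance-aware measure.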
