Abstract

Many sampling-based preprocessing methods have been proposed to address the classification of imbalanced datasets. The fundamental principle of these methods is to rebalance an imbalanced dataset using a specific strategy. Herein, we introduce a novel hybrid method, ant colony optimization resampling (ACOR), to address class imbalance in classification. ACOR consists of two main steps: first, it rebalances an imbalanced dataset using a specific oversampling algorithm; next, it finds a (sub)optimal subset of the balanced dataset by ant colony optimization. Unlike other oversampling techniques, ACOR does not focus on the mechanics of generating new samples. Its main advantage is that existing oversampling algorithms can be fully utilized and an ideal training set can be obtained by ant colony optimization. Therefore, ACOR can enhance the performance of existing oversampling algorithms. Experimental results on 18 real imbalanced datasets show that ACOR yields significantly better results than four popular oversampling methods in terms of various assessment metrics, such as AUC, G-mean, and BACC.
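
The two-step idea described above (oversample first, then search for a good subset) can be illustrated with a minimal sketch. The abstract does not specify the oversampler, the pheromone model, the classifier, or the fitness function, so everything below is an assumption made for illustration: random duplication stands in for the oversampling algorithm, a 1-nearest-neighbour classifier scores candidate subsets by BACC on a held-out validation set, and each sample carries a single pheromone value interpreted as its probability of being kept. All function names and parameters (`random_oversample`, `aco_select`, `n_ants`, `rho`, and so on) are hypothetical and are not taken from the paper.

```python
import math
import random

def make_blob(rng, center, n):
    """Toy 2-D Gaussian cluster (illustrative data, not from the paper)."""
    return [(rng.gauss(center, 0.7), rng.gauss(center, 0.7)) for _ in range(n)]

def random_oversample(X, y, rng):
    """Step 1 stand-in: duplicate minority samples until classes are balanced."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]

def predict_1nn(train_X, train_y, x):
    """Label of the nearest training sample (assumed base classifier)."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[nearest]

def rates(y_true, y_pred):
    """Return (sensitivity, specificity) with class 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    return tp / n_pos, tn / (len(y_true) - n_pos)

def bacc(y_true, y_pred):
    """Balanced accuracy: mean of sensitivity and specificity."""
    tpr, tnr = rates(y_true, y_pred)
    return (tpr + tnr) / 2

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    tpr, tnr = rates(y_true, y_pred)
    return math.sqrt(tpr * tnr)

def fitness(mask, X, y, val_X, val_y):
    """Validation BACC of 1-NN trained on the samples selected by mask."""
    sub_X = [x for x, keep in zip(X, mask) if keep]
    sub_y = [label for label, keep in zip(y, mask) if keep]
    if len(set(sub_y)) < 2:  # a subset missing a class is useless
        return 0.0
    return bacc(val_y, [predict_1nn(sub_X, sub_y, x) for x in val_X])

def aco_select(X, y, val_X, val_y, rng, n_ants=10, n_iter=15, rho=0.2):
    """Step 2 stand-in: a simple binary ant-colony search over keep/drop masks."""
    tau = [0.5] * len(X)  # pheromone = probability of keeping each sample
    best_mask = [True] * len(X)
    best_fit = fitness(best_mask, X, y, val_X, val_y)
    for _ in range(n_iter):
        for _ in range(n_ants):
            mask = [rng.random() < t for t in tau]  # one ant builds one subset
            fit = fitness(mask, X, y, val_X, val_y)
            if fit > best_fit:
                best_mask, best_fit = mask, fit
        # Evaporate, then reinforce toward the best subset found so far.
        tau = [(1 - rho) * t + rho * (1.0 if keep else 0.0)
               for t, keep in zip(tau, best_mask)]
        tau = [min(0.95, max(0.05, t)) for t in tau]  # keep exploration alive
    return best_mask, best_fit

rng = random.Random(0)
train_X = make_blob(rng, 0.0, 40) + make_blob(rng, 1.5, 8)
train_y = [0] * 40 + [1] * 8
val_X = make_blob(rng, 0.0, 20) + make_blob(rng, 1.5, 10)
val_y = [0] * 20 + [1] * 10

bal_X, bal_y = random_oversample(train_X, train_y, rng)
base_fit = fitness([True] * len(bal_X), bal_X, bal_y, val_X, val_y)
mask, best_fit = aco_select(bal_X, bal_y, val_X, val_y, rng)
print(f"BACC with all samples: {base_fit:.3f}, with ACO subset: {best_fit:.3f}")
```

Because the search starts from the full balanced set, the selected subset can never score below the oversampling-only baseline on the validation data. Note that this sketch selects and reports on the same validation set, which is optimistic; a proper comparison would report metrics on a separate test set.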
