Abstract

Imbalanced data classification is a fundamental problem in data mining. Researchers have proposed many solutions, such as sampling and ensemble learning methods. However, random under-sampling tends to discard representative samples, and ensemble learning does not exploit the correlation information between samples in the data set. Therefore, we propose a Hybrid Adaptive sampling with Bagging Classifier (HABC). Specifically, we calculate an adaptive sampling rate according to the characteristics of the data set, and then perform density-based under-sampling and over-sampling on the original data set according to that rate. The sampled data subset is then passed to a Bagging classifier, which is used to predict the unknown data. In addition, a multi-objective particle swarm optimization algorithm is used to optimize the prediction result. Extensive experiments on UCI, KEEL, and three bioinformatics datasets show that the proposed method outperforms state-of-the-art algorithms.
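
To make the hybrid sampling step concrete, the following is a minimal sketch under stated assumptions, not the HABC implementation. The abstract does not give the sampling-rate formula or the density measure, so everything here is hypothetical: the rate rule (both classes are moved to the geometric mean of their sizes), the density score (mean distance to the k nearest neighbours), the choice to keep the densest majority points, and the SMOTE-like interpolation used for over-sampling. All function names are illustrative. The sketch stops at sampling; the Bagging classifier and the multi-objective particle swarm optimization step are not shown.

```python
import math
import random


def target_sizes(n_major, n_minor):
    """Hypothetical adaptive rule: both classes move to the geometric mean of their sizes."""
    target = round(math.sqrt(n_major * n_minor))
    return target, target


def knn_mean_distance(i, points, k):
    """Mean Euclidean distance from points[i] to its k nearest neighbours (assumed density score)."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    nearest = dists[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def density_under_sample(majority, n_keep, k=3):
    """Keep the n_keep majority points with the smallest k-NN distance (densest regions)."""
    order = sorted(range(len(majority)), key=lambda i: knn_mean_distance(i, majority, k))
    return [majority[i] for i in order[:n_keep]]


def interpolate_over_sample(minority, n_total, rng):
    """Grow the minority class to n_total by interpolating between random pairs.

    Requires at least two minority points.
    """
    synthetic = []
    while len(minority) + len(synthetic) < n_total:
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return list(minority) + synthetic


def hybrid_sample(majority, minority, seed=0):
    """Under-sample the majority class and over-sample the minority class to the target sizes."""
    rng = random.Random(seed)
    n_major_target, n_minor_target = target_sizes(len(majority), len(minority))
    return (
        density_under_sample(majority, n_major_target),
        interpolate_over_sample(minority, n_minor_target, rng),
    )
```

With 16 majority and 4 minority points, this assumed rule gives a target of 8 for each class, so the majority class is halved and the minority class is doubled. In a full pipeline the balanced subset would then be used to train the bagging ensemble.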
