Abstract

Imbalanced learning has become a research emphasis in recent years because of the growing number of class-imbalance classification problems in real applications. It is particularly challenging when the imbalanced rate is very high. Sampling, including under-sampling and over-sampling, is an intuitive and popular way in dealing with class-imbalance problems, which tries to regroup the original dataset and is also proved to be efficient. The main deficiency is that under-sampling methods usually ignore many majority class examples while over-sampling methods may easily cause over-fitting problem. In this paper, we propose a new algorithm dubbed KA-Ensemble ensembling under-sampling and over-sampling to overcome this issue. Our KA-Ensemble explores EasyEnsemble framework by under-sampling the majority class randomly and over-sampling the minority class via kernel based adaptive synthetic (Kernel-ADASYN) at meanwhile, yielding a group of balanced datasets to train corresponding classifiers separately, and the final result will be voted by all these trained classifiers. Through combining under-sampling and over-sampling in this way, KA-Ensemble is good at solving class-imbalance problems with large imbalanced rate. We evaluated our proposed method with state-of-the-art sampling methods on 9 image classification datasets with different imbalanced rates ranging from less than 2 to more than 15, and the experimental results show that our KA-Ensemble performs better in terms of accuracy (ACC), F-Measure, G-Mean, and area under curve (AUC). Moreover, it can be used in both dichotomy and multi-classification on both image classification and other class-imbalance problems.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.