Abstract

Class-imbalance learning is one of the most challenging problems in machine learning. As a new and important direction in this field, multi-class imbalanced data classification has attracted considerable research attention in recent years. In this paper, we first present a comprehensive review of state-of-the-art classification algorithms for multi-class imbalanced data. We then propose a new multi-class imbalance classification algorithm, hereafter referred to as the Diversified Error Correcting Output Codes (DECOC) method. The main idea of DECOC is to combine the improved Error Correcting Output Codes (ECOC) method for tackling class imbalance with the diversified ensemble learning framework, which finds the best classification algorithm (out of many heterogeneous classification algorithms) for each individual sub-dataset resampled from the original data. We conduct experiments on 19 public datasets to empirically compare the performance of DECOC with 17 state-of-the-art multi-class imbalance learning algorithms, using four performance measures: overall accuracy, geometric mean, F-measure, and area under the curve (AUC). The experimental results demonstrate that DECOC performs significantly better than the other 17 algorithms on these measures. To advance research in this field, we make the source code of DECOC and of the 17 state-of-the-art algorithms above available on GitHub: https://github.com/chongshengzhang/Multi_Imbalance.
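
To make the main idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain one-vs-all code matrix with Hamming decoding, two toy candidate learners (nearest centroid and 1-nearest-neighbour), and selection of the best candidate per code-matrix column by hold-out accuracy. The imbalance-aware improvements to ECOC and the resampling step mentioned above are not reproduced, and all names (`fit_decoc_sketch`, `predict_decoc_sketch`, `CANDIDATES`, `CODE`) and the toy data are hypothetical.

```python
from math import dist  # Python 3.8+


def fit_centroid(X, y):
    """Nearest-centroid binary learner; returns a predict function."""
    cents = {}
    for label in (-1, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda x: min(cents, key=lambda l: dist(x, cents[l]))


def fit_1nn(X, y):
    """1-nearest-neighbour binary learner; returns a predict function."""
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda p: dist(x, p[0]))[1]


# Hypothetical pool of heterogeneous candidate learners.
CANDIDATES = {"centroid": fit_centroid, "1nn": fit_1nn}


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def fit_decoc_sketch(X_tr, y_tr, X_val, y_val, code):
    """For each code-matrix column, keep the candidate with the best hold-out accuracy."""
    n_cols = len(next(iter(code.values())))
    chosen = []
    for j in range(n_cols):
        # Relabel the multi-class data as the binary sub-problem of column j.
        b_tr = [code[c][j] for c in y_tr]
        b_val = [code[c][j] for c in y_val]
        models = {name: fit(X_tr, b_tr) for name, fit in CANDIDATES.items()}
        best = max(models, key=lambda n: accuracy(models[n], X_val, b_val))
        chosen.append((best, models[best]))
    return chosen


def predict_decoc_sketch(chosen, code, x):
    """Decode by minimum Hamming distance between the output bits and each codeword."""
    bits = [model(x) for _, model in chosen]
    return min(code, key=lambda c: sum(b != w for b, w in zip(bits, code[c])))


# One-vs-all code matrix: one codeword per class, one column per binary sub-problem.
CODE = {"a": (1, -1, -1), "b": (-1, 1, -1), "c": (-1, -1, 1)}

# Toy imbalanced data: class "a" is the majority, class "c" the minority.
X_tr = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.4, 0.0), (0.0, 0.4),
        (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
        (0.0, 5.0), (0.2, 5.1)]
y_tr = ["a"] * 6 + ["b"] * 3 + ["c"] * 2
X_val = [(0.1, 0.1), (0.3, 0.3), (5.1, 5.1), (0.1, 4.9)]
y_val = ["a", "a", "b", "c"]

chosen = fit_decoc_sketch(X_tr, y_tr, X_val, y_val, CODE)
for point in [(0.1, 0.2), (5.0, 4.8), (0.1, 5.2)]:
    print(point, predict_decoc_sketch(chosen, CODE, point))
```

Selecting on a hold-out split rather than on the training data is a design choice of this sketch: a learner such as 1-nearest-neighbour scores perfectly on its own training set, so training accuracy would not discriminate between candidates.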
