Abstract
Traditional classification tasks suffer from the class imbalance problem, in which some classes far outnumber others. To address this issue, existing class-imbalanced learning (CIL) methods either preprocess class-imbalanced datasets or adapt traditional classification algorithms to the imbalanced class distribution. Inspired by the idea of transductive learning, we propose a post-processing framework, called PPF, for CIL. Distinct from existing CIL methods, PPF directly adjusts the predicted labels of test data to fit the imbalanced class distribution. Specifically, we relabel some test data according to their prediction probabilities so that the class proportions of the test data are close to those of the training data. The underlying assumption is that training and test data, drawn independently from the same data space, should obey the same class distribution. Furthermore, we propose a Compact Prototype-based Nearest Neighbor (CPNN) algorithm to assist the original classifier with the adjustment. Instead of training a classifier, CPNN classifies test data according to their distances to a set of prototypes estimated from labeled data. It is therefore computationally simple and relatively robust to class imbalance. As a general framework, PPF can be easily applied to both traditional classification and CIL algorithms. To validate its effectiveness, we conducted extensive experiments on a variety of class-imbalanced datasets, using SVM and C4.5 as the original classifiers. Measured by F-measure, G-mean, and AUC, both PPF-SVM and PPF-C4.5 outperform 10 state-of-the-art CIL algorithms, and PPF further improves the performance of 10 CIL algorithms when applied on top of them.
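The relabeling step described in the abstract can be sketched for the binary case as follows. This is a minimal illustrative sketch, not the paper's exact procedure: the function name `ppf_adjust` and the choice to flip the least-confident predictions of the over-represented class are assumptions made for illustration.

```python
import numpy as np

def ppf_adjust(probs, train_proportions):
    """Illustrative PPF-style post-processing for binary classification.

    probs: (n, 2) array of predicted class probabilities for test data.
    train_proportions: (2,) array of class proportions in the training data.

    Relabels the least-confident predictions of the over-represented class
    so the test label proportions approach the training proportions.
    """
    labels = probs.argmax(axis=1)          # initial hard predictions
    n = len(labels)
    target = np.round(train_proportions * n).astype(int)
    counts = np.bincount(labels, minlength=2)

    # Decide which class is predicted more often than training suggests
    # (assumption: only the majority side of the mismatch is relabeled).
    over = int(counts[1] > target[1])      # 1 if class 1 is over-predicted
    under = 1 - over
    excess = counts[over] - target[over]

    if excess > 0:
        # Test points currently assigned to the over-represented class,
        # ordered from least to most confident in that class.
        idx = np.where(labels == over)[0]
        order = idx[np.argsort(probs[idx, over])]
        labels[order[:excess]] = under     # flip the least-confident ones
    return labels
```

For example, if a classifier assigns class 1 to half of ten test points while the training data contains only 20% class-1 examples, the sketch flips the three least-confident class-1 predictions back to class 0, bringing the test proportions to 8:2.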