Abstract

Most real-world classification problems suffer from class skewness, and specialized methods are needed to handle it. Several methods have been proposed for this purpose, including ensemble methods. Most such ensemble methods aim to create balanced sub-problems; examples include UnderBagging based Kernelized Extreme Learning Machine (UBKELM), UnderBagging based Reduced Kernel Extreme Learning Machine (UBRKELM), Random Undersampling Boost (RUSBoost), EasyEnsemble and BalanceCascade. This work suggests that, apart from class skewness, other factors such as class overlapping, the length of the decision boundary and the number of probability distributions present in the problem are also responsible for degrading classifier performance. This work proposes an ensemble method that decomposes a complex imbalanced problem into simpler sub-problems, solves these sub-problems using cost-sensitive classifiers and then combines the classifiers' outputs using voting methods. A classification problem can have a mixture distribution, and its complexity increases with the number of probability distributions present in it. The proposed decomposition method tries to create less complex sub-problems, each containing fewer distributions than the original problem. A clustering evaluation algorithm is used to find the optimal number of sub-problems. To decompose an imbalanced classification problem into sub-problems, this work uses the fuzzy c-means (FCM) clustering algorithm, which is a soft clustering algorithm, i.e. the overlap between the sub-problems depends on the value selected for the threshold (TH) parameter in the FCM algorithm. The sub-problems created in this way may or may not be balanced, so Weighted Kernelized Extreme Learning Machine (WKELM) is used to build the classifiers for these sub-problems. The final prediction of the ensemble is determined using soft voting and majority voting. The proposed method is evaluated on 38 benchmark binary class-imbalanced datasets downloaded from the KEEL dataset repository. The results show that the proposed method outperforms other state-of-the-art methods for imbalanced classification. The Wilcoxon signed-rank test is performed to show that the improvement is significant.
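
The decomposition and voting steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`fcm`, `decompose`, `soft_vote`, `majority_vote`) are hypothetical, and the rule that a sample joins every sub-problem whose FCM membership is at least TH is an assumption about how the threshold is applied. Training a WKELM on each sub-problem and choosing the number of clusters with a clustering evaluation algorithm are both omitted; the per-classifier scores passed to the voting functions are placeholders.

```python
import math
import random


def fcm(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Each center is the mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            w = [row[k] ** m for row in u]
            w_sum = sum(w)
            centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / w_sum
                 for d in range(dim)]
            )
        # Memberships are proportional to distance**(-2 / (m - 1)).
        for i, p in enumerate(points):
            inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def decompose(memberships, th):
    """Assumed rule: sample i joins sub-problem k when u[i][k] >= th."""
    n_clusters = len(memberships[0])
    return [
        [i for i, row in enumerate(memberships) if row[k] >= th]
        for k in range(n_clusters)
    ]


def soft_vote(scores):
    """Average the positive-class scores; predict 1 when the mean >= 0.5."""
    return 1 if sum(scores) / len(scores) >= 0.5 else 0


def majority_vote(scores):
    """Threshold each score at 0.5, then predict the more frequent label."""
    votes = sum(1 for s in scores if s >= 0.5)
    return 1 if 2 * votes > len(scores) else 0


if __name__ == "__main__":
    # Two separated groups plus one point lying between them.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.5, 5.5)]
    _, u = fcm(pts, n_clusters=2)
    print(decompose(u, th=0.3))
    # Hypothetical scores from three sub-problem classifiers.
    print(soft_vote([0.9, 0.4, 0.4]), majority_vote([0.9, 0.4, 0.4]))
```

Under this assumed rule, a lower TH lets more borderline samples appear in several sub-problems, which increases the overlap between them. The last line also shows that soft voting and majority voting can disagree when one classifier is confident and the others are marginal.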

