Abstract

Imbalanced learning for classification problems is an active area of research in machine learning. Many classification systems, such as image retrieval and credit scoring systems, have training data sets with an imbalanced class distribution, which degrades the performance of the classifier. Re-sampling of imbalanced data is commonly used to handle the imbalanced distribution because it is independent of the classifier being used. However, re-sampling can sometimes remove necessary data of a class or cause over-fitting. Classifier ensembles have recently received more attention as an effective technique for handling skewed data. The focus of this work is to gain the advantages of both the data-level and the classifier ensemble approaches in order to improve classification performance. We present a novel approach that first applies pre-processing to the imbalanced dataset in order to reduce the imbalance between the classes. The pre-processed data is then provided as the training dataset to a classifier ensemble that introduces diversity by using different training datasets as well as different classifier models. Experiments conducted on eight imbalanced datasets from the KEEL repository demonstrate the significance of the proposed method. A comparative analysis shows the performance improvement in terms of Area Under the ROC Curve (AUC).
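The pipeline described above (pre-process to reduce imbalance, train a diverse ensemble, evaluate by AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: random over-sampling stands in for the unspecified pre-processing step, stratified bootstrap samples provide the different training datasets, a nearest-centroid scorer and a k-nearest-neighbour scorer stand in for the different classifier models, and the data is synthetic rather than drawn from KEEL. All function names are hypothetical.

```python
import random

def make_data(n_neg, n_pos, rng):
    """Synthetic 2-D data (assumption): majority near (0,0), minority near (2,2)."""
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_neg)]
    X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(n_pos)]
    return X, [0] * n_neg + [1] * n_pos

def oversample(X, y, rng):
    """Pre-processing stand-in: duplicate minority rows until classes are balanced."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = pos + neg + extra
    return [X[i] for i in idx], [y[i] for i in idx]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def centroid_model(X, y):
    """Score in [0, 1]: relative closeness to the positive-class centroid."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    mu1 = mean([x for x, t in zip(X, y) if t == 1])
    mu0 = mean([x for x, t in zip(X, y) if t == 0])
    def score(x):
        d0, d1 = sq_dist(x, mu0) ** 0.5, sq_dist(x, mu1) ** 0.5
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)
    return score

def knn_model(X, y, k=5):
    """Score in [0, 1]: fraction of positives among the k nearest training points."""
    def score(x):
        nearest = sorted(zip(X, y), key=lambda p: sq_dist(p[0], x))[:k]
        return sum(t for _, t in nearest) / k
    return score

def train_ensemble(X, y, n_members=6, seed=0):
    """Balance the data, then fit alternating model types on stratified bootstraps."""
    rng = random.Random(seed)
    Xb, yb = oversample(X, y, rng)
    builders = [centroid_model, knn_model]
    members = []
    for m in range(n_members):
        idx = []
        for label in (0, 1):  # resample within each class so both are always present
            pool = [i for i, t in enumerate(yb) if t == label]
            idx += [rng.choice(pool) for _ in pool]
        Xs, ys = [Xb[i] for i in idx], [yb[i] for i in idx]
        members.append(builders[m % len(builders)](Xs, ys))
    # Ensemble output: average of the member scores.
    return lambda x: sum(f(x) for f in members) / len(members)

def auc(scores, y):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(42)
X_train, y_train = make_data(200, 20, rng)   # 10:1 imbalance ratio (assumption)
X_test, y_test = make_data(200, 20, rng)
model = train_ensemble(X_train, y_train)
test_auc = auc([model(x) for x in X_test], y_test)
print(f"ensemble AUC on held-out synthetic data: {test_auc:.3f}")
```

Only the training data is re-sampled; the held-out set keeps its original imbalance, so the AUC reflects performance on the skewed distribution the classifier would actually face.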
