Abstract

The class imbalance problem has recently attracted considerable attention from the data mining community and has become an active area of machine learning research. The Consolidated Tree Construction (CTC) algorithm was proposed to solve classification problems with a high degree of class imbalance without losing explaining capacity, a desirable characteristic of single decision trees and rule sets. CTC resamples the training sample and builds a tree from each subsample, much like ensemble classifiers, but it applies the ensemble process during the tree construction phase, yielding a single final tree. At the ECML/PKDD 2013 conference the term “Inner Ensembles” was coined to refer to such methodologies. In this paper we propose a resampling strategy for classification algorithms that use multiple subsamples. The strategy is based on the class distribution of the training sample and ensures a minimum representation of every class in each subsample. It has been applied to CTC over different classification contexts. A robust classification algorithm should not merely rank in the top positions on certain classification problems; it should excel across a broad range of problems. In this paper we establish the robustness of the CTC algorithm against a wide set of classification algorithms with explaining capacity.
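The proposed strategy of guaranteeing a minimum representation of all classes in each subsample can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual implementation: the function name `stratified_resample` and the `min_per_class` parameter are hypothetical, and the sketch simply reserves a fixed number of slots per class before filling the remainder by uniform sampling with replacement.

```python
import random
from collections import defaultdict

def stratified_resample(X, y, sample_size, min_per_class=1, seed=None):
    """Hypothetical sketch: draw a resample of (X, y) that guarantees at
    least `min_per_class` instances of every class, filling the remainder
    by uniform sampling with replacement from the whole training set."""
    rng = random.Random(seed)

    # Group instance indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    # Reserve slots to guarantee minimum representation of each class.
    chosen = []
    for idxs in by_class.values():
        chosen.extend(rng.choices(idxs, k=min_per_class))

    # Fill the remaining slots uniformly (with replacement).
    remaining = sample_size - len(chosen)
    chosen.extend(rng.choices(range(len(y)), k=remaining))

    return [X[i] for i in chosen], [y[i] for i in chosen]
```

With a severely imbalanced training sample (e.g. 8 instances of one class and 2 of another), a plain bootstrap may produce subsamples that omit the minority class entirely; the reserved per-class slots above prevent that, which is the property the proposed strategy provides to the multiple subsamples used by CTC.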
