Abstract

Accuracy and diversity are considered the two driving factors in generating an ensemble classifier. Focusing only on accuracy causes the ensemble classifier to suffer from “diminishing returns”, and the ensemble accuracy tends to plateau, whereas focusing only on diversity degrades the ensemble's accuracy. A balance must therefore be maintained between the two for the ensemble classifier to achieve high classification accuracy. In this paper, we propose a novel diversity measure, Misclassification Diversity (MD), and an Incremental Layered Classifier Selection (ILCS) approach to generate an ensemble classifier. The proposed approach, ILCS-MD, generates an ensemble classifier by incrementally selecting classifiers from the base classifier pool based on increasing accuracy and diversity. The benefits are twofold: 1) the generated ensemble classifier contains only those classifiers from the pool that can maximize accuracy whilst maintaining or increasing diversity, and 2) the generated ensemble classifier selects only a few classifiers from the base classifier pool, thus also reducing the ensemble component size. The proposed approach is evaluated on 55 benchmark datasets taken from the UCI and KEEL dataset repositories. The results are compared with five existing pairwise diversity measures and with existing state-of-the-art ensemble classifier approaches. A significance test is also conducted to verify the significance of the results.

Highlights

  • Ensemble classifiers, also known as “multi-classifier systems”, are machine learning classification methods used to obtain better predictive performance than a single classifier

  • We propose a novel pairwise diversity measure, which computes diversity using the misclassification labels of two classifiers, and a novel incremental layered classifier selection approach

  • The incremental layered classifier selection approach selects classifiers in each layer based on the new diversity measure to generate an ensemble classifier

Summary

INTRODUCTION

Ensemble classifiers, also known as “multi-classifier systems”, are machine learning classification methods used to obtain better predictive performance than a single classifier. Diversity and accuracy have been identified as the two driving factors in generating an ensemble classifier [4], [5]. Accuracy is the ability of a classifier to generate class labels as close to the ground truth as possible, and diversity is the difference between the classification abilities of the classifiers in the ensemble. In this paper, we generate an ensemble classifier by maximizing diversity and accuracy. A novel pairwise diversity measure is proposed. This diversity measure is used to determine whether a classifier should be selected from the pool to form the ensemble. A novel incremental classifier selection approach is proposed to generate an ensemble classifier using the proposed diversity measure.
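To make the interplay of accuracy and pairwise diversity concrete, the following is a minimal Python sketch of a greedy, diversity-aware classifier selection. It is not the paper's MD measure or ILCS procedure, whose exact definitions are not given in this summary. As stand-ins, it assumes the classic disagreement measure (the share of samples on which exactly one of two classifiers is wrong), majority voting with ties resolved in favour of the first-seen label, and an acceptance rule under which neither accuracy nor diversity may fall and at least one must rise. The layering of ILCS is omitted, and all function names and the toy data are hypothetical.

```python
from itertools import combinations


def accuracy(pred, truth):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def pairwise_disagreement(pred_a, pred_b, truth):
    """Stand-in diversity: share of samples where exactly one classifier errs."""
    wrong_a = [p != t for p, t in zip(pred_a, truth)]
    wrong_b = [p != t for p, t in zip(pred_b, truth)]
    return sum(a != b for a, b in zip(wrong_a, wrong_b)) / len(truth)


def mean_diversity(preds, truth):
    """Average pairwise disagreement (0.0 for fewer than two members)."""
    pairs = list(combinations(preds, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_disagreement(a, b, truth) for a, b in pairs) / len(pairs)


def majority_vote(preds):
    """Per-sample plurality label; ties go to the label seen first."""
    return [max(dict.fromkeys(labels), key=labels.count) for labels in zip(*preds)]


def select_ensemble(pool, truth):
    """Greedy selection over a pool of prediction vectors.

    Assumed rule (not from the paper): accept a candidate if neither ensemble
    accuracy nor mean diversity falls and at least one of them rises.
    """
    # Visit candidates from most to least accurate; equal scores keep pool order.
    order = sorted(range(len(pool)), key=lambda i: accuracy(pool[i], truth), reverse=True)
    chosen = [order[0]]
    for i in order[1:]:
        current = [pool[j] for j in chosen]
        trial = current + [pool[i]]
        old_acc = accuracy(majority_vote(current), truth)
        new_acc = accuracy(majority_vote(trial), truth)
        old_div = mean_diversity(current, truth)
        new_div = mean_diversity(trial, truth)
        no_loss = new_acc >= old_acc and new_div >= old_div
        some_gain = new_acc > old_acc or new_div > old_div
        if no_loss and some_gain:
            chosen.append(i)
    return chosen


# Toy validation data: four classifiers, each 75% accurate on eight samples.
truth = [0, 1, 0, 1, 0, 1, 0, 1]
pool = [
    [0, 1, 0, 1, 0, 1, 1, 0],  # errs on the last two samples
    [0, 1, 0, 1, 1, 0, 0, 1],  # errs on samples 4 and 5
    [1, 0, 0, 1, 0, 1, 0, 1],  # errs on the first two samples
    [0, 1, 0, 1, 0, 1, 1, 0],  # duplicate of the first classifier
]
print(select_ensemble(pool, truth))  # prints [0, 1, 2]
```

In this toy run the three classifiers that err on different samples are kept, and their majority vote is correct on every sample. The duplicate is rejected because adding it lowers the voted accuracy, which illustrates why a small, diverse subset can outperform the full pool.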

BACKGROUND
Findings
CONCLUSION
