Abstract

This paper presents a simulation-based empirical study of the performance profile of random subsample ensembles with a hybrid mix of base learners in high-dimensional feature spaces. The performance of a hybrid random subsample ensemble that combines C4.5, k-nearest neighbor (kNN), and naïve Bayes base learners is compared, through statistical testing, with that of homogeneous random subsample ensembles that employ only one type of base learner. The simulation study employs five datasets with up to 20K features from the UCI Machine Learning Repository. Random subsampling without replacement is used to map the original high-dimensional feature space of each dataset to a multiplicity of lower-dimensional feature subspaces. The study explores the effect of two design parameters, the number of base classifiers and the subsampling rate, on the performance of the hybrid random subsample ensemble. The ensemble architecture uses a voting combiner in all cases. Simulation results indicate that hybridizing the base learners of a random subsample ensemble improves prediction accuracy and yields more robust performance.
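
The architecture described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each base learner is trained on its own feature subset drawn without replacement, that learner types are assigned round-robin, and that predictions are combined by plurality vote. The class names `OneNN`, `NearestCentroid`, and `HybridRandomSubspaceEnsemble`, along with the parameters `n_learners` and `rate`, are hypothetical. The two toy learners are stand-ins for illustration only; they are not C4.5, kNN, or naïve Bayes as used in the study.

```python
import random
from collections import Counter


def project(X, idx):
    """Keep only the feature columns listed in idx."""
    return [[row[j] for j in idx] for row in X]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


class OneNN:
    """Toy 1-nearest-neighbour learner (illustrative stand-in)."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        out = []
        for row in X:
            dists = [sq_dist(row, ref) for ref in self.X]
            out.append(self.y[dists.index(min(dists))])
        return out


class NearestCentroid:
    """Toy nearest-centroid learner (illustrative stand-in)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [r for r, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda lab: sq_dist(row, self.centroids[lab]))
                for row in X]


class HybridRandomSubspaceEnsemble:
    """Heterogeneous base learners, each trained on a random feature subset."""

    def __init__(self, learner_factories, n_learners=9, rate=0.5, seed=0):
        self.learner_factories = learner_factories  # callables returning fresh learners
        self.n_learners = n_learners                # count of base classifiers
        self.rate = rate                            # fraction of features per subspace
        self.rng = random.Random(seed)

    def fit(self, X, y):
        n_features = len(X[0])
        k = max(1, round(self.rate * n_features))
        self.members = []
        for i in range(self.n_learners):
            # random.sample draws without replacement, so indices are unique
            idx = sorted(self.rng.sample(range(n_features), k))
            # assumption: learner types are assigned round-robin
            factory = self.learner_factories[i % len(self.learner_factories)]
            self.members.append((idx, factory().fit(project(X, idx), y)))
        return self

    def predict(self, X):
        votes = [model.predict(project(X, idx)) for idx, model in self.members]
        # plurality vote per sample; ties go to the label seen first
        return [Counter(col).most_common(1)[0][0] for col in zip(*votes)]
```

Any object exposing `fit(X, y)` (returning the fitted learner) and `predict(X)` could be passed as a factory in place of the toy learners, which is how a decision tree, kNN, and naïve Bayes mix would be plugged in. Passing a single factory gives the homogeneous ensemble used as the comparison baseline.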
