Abstract

We developed a heterogeneous ensemble method that boosts the performance of the random forest classifier. The method combines the boosting capability of the AdaBoost algorithm with the feature selection and bagging capability of the random subspace algorithm; both use random forest as the base classifier and are combined by voting. We preprocessed the dataset by removing redundant features and the outliers detected in the dataset features. We addressed the multiclass nature of the problem by decomposing the dataset into binary classes using the one-against-one technique enhanced with pairwise coupling. Because supervised algorithms tend to be biased toward the majority class, we generated synthetic minority-class instances using the Synthetic Minority Oversampling Technique (SMOTE). The proposed SMOTE_Voted outlier ensemble method outperformed the Random Forest, KNN, Naive Bayes, C4.5, and Support Vector Machine outlier detection methods. We conclude that the ensemble technique improves outlier detection performance for multiclass problems.
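For concreteness, the following is a minimal sketch of how such a pipeline could be assembled with scikit-learn and imbalanced-learn. The estimator settings, the soft-voting choice, and the use of OneVsOneClassifier to approximate the one-against-one decomposition with pairwise coupling are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch only: parameter values and library choices are
# assumptions, not the configuration used in the paper.
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.multiclass import OneVsOneClassifier
from imblearn.over_sampling import SMOTE  # pip install imbalanced-learn


def build_smote_voted_ensemble(random_state=42):
    rf = RandomForestClassifier(n_estimators=100, random_state=random_state)

    # Boosting branch: AdaBoost with random forest as the base classifier.
    boosted = AdaBoostClassifier(rf, n_estimators=10, random_state=random_state)

    # Random-subspace branch: each member is trained on a random subset of
    # features (bootstrap_features=True, bootstrap=False), again with a
    # random forest base classifier.
    subspace = BaggingClassifier(rf, n_estimators=10, max_features=0.5,
                                 bootstrap=False, bootstrap_features=True,
                                 random_state=random_state)

    # Combine the two branches by voting; soft voting averages class
    # probabilities (whether the paper used hard or soft voting is assumed).
    voted = VotingClassifier(estimators=[("adaboost", boosted),
                                         ("random_subspace", subspace)],
                             voting="soft")

    # Decompose the multiclass problem into pairwise (one-against-one)
    # binary problems; scikit-learn aggregates the pairwise decisions itself.
    return OneVsOneClassifier(voted)


# Usage sketch: rebalance the (already preprocessed) training data with
# SMOTE before fitting the ensemble.
# X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X_train, y_train)
# model = build_smote_voted_ensemble().fit(X_resampled, y_resampled)
# predictions = model.predict(X_test)
```

Note that this sketch applies SMOTE once to the full training set before the one-against-one decomposition; the paper may instead have oversampled each binary subproblem separately, which the wrapper above does not do.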
