Abstract

Many real-world applications involve multiclass datasets characterized by imbalanced classes and by redundant and irrelevant features that degrade classifier performance. Minority classes in such datasets are treated as outlier classes. This research aimed to establish the role of ensemble techniques in improving the performance of multiclass classification. Multiclass datasets were transformed into binary datasets and resampled using the Synthetic Minority Oversampling Technique (SMOTE) algorithm. Relevant features were selected with an ensemble filter method built from the Correlation, Information Gain, Gain Ratio and ReliefF filter selection methods. The AdaBoost and Random Subspace learning algorithms were combined by voting, with random forest as the base classifier. The classifiers were evaluated using 10-fold stratified cross-validation. The model showed better performance in outlier detection and classification prediction for the multiclass problem. It outperformed well-known existing classification and outlier detection algorithms such as Naive Bayes, KNN, Bagging, JRipper, decision trees, RandomTree and random forest. The findings established that ensemble techniques, dataset resampling and multiclass decomposition result in improved classification performance as well as enhanced detection of minority outlier (rare) classes.

Keywords: Multiclass, Classification, Outliers, Ensemble, Learning Algorithm

DOI: 10.7176/JIEA/9-5-04

Publication date: August 31st, 2019
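The resampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy data are assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric tuples belonging to the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for i, p in enumerate(minority) if i != base_idx),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square (assumed data)
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_oversample(minority_points, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region the minority class already occupies, which is why this kind of oversampling tends to be less prone to overfitting than duplicating samples.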

Highlights

  • Multiclass classification has attracted considerable research interest because of its challenges and its wide application in real life

  • The minority classes in a multiclass dataset may be described as rare classes, rare events or outliers (Chawla, 2009)

  • Proposed method: we propose an ensemble multiclass classification and outlier detection method for data mining


Summary

Introduction

Multiclass classification has attracted considerable research interest because of its challenges and its wide application in real life. The minority classes in a multiclass dataset may be described as rare classes, rare events or outliers (Chawla, 2009). In a multiclass imbalance problem, the rare classes form the class of interest, since existing classification algorithms were designed with a bias towards predicting the majority classes (Athimethphat & Lerteerawong, 2012). Learning under such conditions exposes biases particular to several learning algorithms, and these biases are most significant in real applications such as biological data analysis, image classification, text classification, and web page classification. Several strategies and techniques are required to solve the multiclass problem. According to Elkano et al. (2017), decomposition strategies have been demonstrated to be a successful methodology for multiclass classification.
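To make the idea of decomposition concrete, the sketch below turns one multiclass label list into a set of binary problems. The source does not say which decomposition scheme was used, so one-vs-rest is an assumption here, as are the helper names `one_vs_rest_labels` and `imbalance_ratio` and the toy labels.

```python
from collections import Counter


def one_vs_rest_labels(labels):
    """Map a multiclass label list to one binary (1/0) label list per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def imbalance_ratio(binary_labels):
    """Return negatives per positive; infinite if there are no positives."""
    counts = Counter(binary_labels)
    if counts[1] == 0:
        return float("inf")
    return counts[0] / counts[1]


# Toy multiclass labels (assumed data): "c" is the rare class
labels = ["a", "a", "a", "a", "b", "b", "c"]
binary_problems = one_vs_rest_labels(labels)
ratios = {c: imbalance_ratio(y) for c, y in binary_problems.items()}
```

In this toy case the binary problem for the rare class "c" has six negatives for one positive, which shows how decomposition can sharpen the imbalance of each sub-problem and why a resampling step is commonly paired with it.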
