Abstract

Classification of imbalanced data with the Random Forest Classification (RFC) technique has gained considerable prominence in current applications. Data imbalance in practical applications takes the form of either binary class imbalance or multiclass imbalance. In binary class imbalance, one class holds the majority of the data samples and the other holds only a small number. Multiclass imbalanced datasets fall into two categories: Multiclass Minority Imbalanced Class (MMinIC) and Multiclass Majority Imbalanced Class (MMajIC). Classification performance tends to degrade more for MMajIC than for MMinIC because the imbalance is more severe. This paper investigates the influence of RFC on binary and multiclass imbalanced datasets. The analysis measures classification accuracy using the following performance metrics, reported for each class in the referenced dataset: True Positive (TP) Rate, False Positive (FP) Rate, Precision (Pre), Recall (Rec), F-Measure, Receiver Operating Characteristic (ROC) area, Matthews Correlation Coefficient (MCC), and Precision-Recall Curve (PRC) area. The paper focuses on reducing the negative influence of imbalanced data by applying the Synthetic Minority Oversampling Technique (SMOTE). The experimental analysis uses datasets from the Knowledge Extraction based on Evolutionary Learning (KEEL) imbalanced data learning repository and combines RFC with SMOTE. It also covers construction of the RFC model, with the success rate calculated stage by stage on the training and testing partitions, and the impact of this partitioning on accuracy. The study includes an error analysis of incorrectly classified instances. Experimental results indicate that imbalanced data have a significant impact on classification accuracy and that RFC performs better when combined with SMOTE.
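The oversampling idea behind SMOTE can be illustrated with a short sketch. This is not the authors' implementation, and the abstract does not state which tool was used; it is a minimal pure-Python version of the core interpolation step, assuming Euclidean distance and a toy two-feature dataset invented for illustration. The function name `smote_sketch` and its parameters (`n_synthetic`, `k`, `seed`) are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their nearest minority-class neighbours (the core idea of SMOTE).

    Hypothetical helper for illustration only; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    # A sample cannot have more neighbours than there are other samples.
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed, not from the paper): 8 majority and 3 minority points.
majority = [(float(x), float(y)) for x in range(4) for y in range(2)]
minority = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_sketch(minority, n_synthetic=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without duplicating samples exactly. In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training partition only, so that the test partition still reflects the original class distribution.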
