Abstract

The classification of imbalanced data has attracted considerable attention in machine learning in recent years. On imbalanced data, traditional classification algorithms tend to assign minority class samples to the majority class, so many minority samples are misclassified. To address this problem, this paper proposes a Density Based Safe Level Synthetic Minority Oversampling TEchnique (DB-SLSMOTE). First, the algorithm clusters the minority samples with Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Then, the Safe Level Synthetic Minority Oversampling TEchnique (Safe-Level-SMOTE) is applied to the clusters discovered by DBSCAN, which may be of any shape. Finally, the processed data are classified by Random Forest (RF). The experimental results show that the DB-SLSMOTE algorithm can effectively improve the classification performance of RF on minority samples in imbalanced data.
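
The first two steps described above (DBSCAN clustering of the minority samples, then Safe-Level-SMOTE inside each cluster) can be sketched roughly as follows. This is a minimal illustration in standard-library Python, not the authors' implementation. The function names (`dbscan`, `safe_level`, `choose_gap`, `db_slsmote`), the parameters, and the toy data are all hypothetical. The sketch assumes that DBSCAN noise points are left out of oversampling, that the neighbour used for interpolation is drawn from the same cluster, and that safe levels are counted over the whole data set; the abstract does not confirm any of these details. The Random Forest step is omitted.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one cluster label per point, -1 for noise."""
    def region(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = region(j)
            if len(reach) >= min_pts:  # j is a core point, so keep expanding
                seeds.extend(reach)
    return labels

def safe_level(i, all_points, n_min, k):
    """Number of minority samples among the k nearest neighbours of sample i."""
    others = sorted((j for j in range(len(all_points)) if j != i),
                    key=lambda j: math.dist(all_points[i], all_points[j]))
    return sum(1 for j in others[:k] if j < n_min)  # minority samples come first

def choose_gap(sl_p, sl_n, rng):
    """Interpolation weight from the safe levels of p and its neighbour n."""
    if sl_p == 0 and sl_n == 0:
        return None                         # both look like noise: skip
    if sl_n == 0:
        return 0.0                          # neighbour unsafe: duplicate p
    ratio = sl_p / sl_n
    if ratio == 1:
        return rng.uniform(0.0, 1.0)        # equally safe: anywhere on the segment
    if ratio > 1:
        return rng.uniform(0.0, 1 / ratio)  # p safer: stay close to p
    return rng.uniform(1 - ratio, 1.0)      # n safer: stay close to n

def db_slsmote(minority, majority, eps, min_pts, k, n_new, seed=0):
    """Generate up to n_new synthetic minority points, cluster by cluster."""
    rng = random.Random(seed)
    all_points = minority + majority
    n_min = len(minority)
    levels = [safe_level(i, all_points, n_min, k) for i in range(n_min)]
    clusters = {}
    for i, label in enumerate(dbscan(minority, eps, min_pts)):
        if label != -1:                     # assumption: noise is not oversampled
            clusters.setdefault(label, []).append(i)
    usable = [m for m in clusters.values() if len(m) >= 2]
    synthetic, attempts = [], 0
    # The attempt cap stops the loop if every candidate pair is skipped.
    while usable and len(synthetic) < n_new and attempts < 100 * n_new:
        attempts += 1
        members = rng.choice(usable)
        p = rng.choice(members)
        near = sorted((j for j in members if j != p),
                      key=lambda j: math.dist(minority[p], minority[j]))[:k]
        n = rng.choice(near)
        gap = choose_gap(levels[p], levels[n], rng)
        if gap is not None:
            synthetic.append(tuple(a + gap * (b - a)
                                   for a, b in zip(minority[p], minority[n])))
    return synthetic

# Toy usage: two tight minority clusters plus one isolated minority point.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (10.0, 0.0)]
majority = [(2.5, 2.5), (2.6, 2.4), (3.0, 3.0), (2.0, 2.0),
            (10.0, 0.2), (10.2, 0.0), (9.8, 0.1)]
new_points = db_slsmote(minority, majority, eps=0.3, min_pts=3, k=3, n_new=20)
```

In a full pipeline, the synthetic points would be appended to the training set with the minority label before fitting the classifier. Restricting interpolation to pairs within one cluster keeps each synthetic point on a segment between two minority samples that DBSCAN considers density-connected, which is one plausible reading of how clustering and safe-level weighting are combined.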

