Abstract

Imbalanced class distribution occurs when the number of examples representing one class is much lower than that of the others. This condition degrades prediction accuracy on the minority class. To address this problem, the Synthetic Minority Oversampling Technique (SMOTE) has become a pioneering oversampling method in the imbalanced-classification research community. The basic idea of SMOTE is to oversample the minority class by creating synthetic instances in the feature space between an instance and its K nearest neighbors; this helps avoid overfitting and assists the classifier in finding decision boundaries between classes. In this paper, we review current issues and problems in classification with imbalanced data, performance evaluation for imbalanced data, and extensions of SMOTE proposed in recent years, and finally identify current challenges and future work in learning with imbalanced data.
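The interpolation idea behind SMOTE described above can be sketched in a few lines. The following is a minimal illustrative implementation, not the reference algorithm: the function name `smote_sample` and the parameters `k` and `n_new` are assumptions for this sketch, and neighbor search is done with a brute-force distance matrix rather than an optimized index.

```python
import numpy as np

def smote_sample(X_min, k=3, n_new=10, rng=None):
    """Minimal SMOTE-style sketch (illustrative, not the reference algorithm):
    generate n_new synthetic minority samples by interpolating between a
    randomly chosen minority instance and one of its k nearest neighbors."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    # Brute-force pairwise distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude each point itself
    nbrs = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbors
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)              # pick a minority instance at random
        j = nbrs[i, rng.integers(k)]     # pick one of its k neighbors
        gap = rng.random()               # interpolation factor in [0, 1]
        # New point lies on the segment between the instance and its neighbor.
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)
```

Because each synthetic point is a convex combination of two minority instances, all generated samples lie inside the region spanned by the minority class, which is what lets SMOTE densify that region without simply duplicating existing points.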
