Clustering algorithms on imbalanced data using the SMOTE technique for image segmentation

Wajira Abeysinghe, Slim Bechikh, Chih-Cheng Hung, Altaf Rattani, Xiaosong Wang

doi:10.1145/3264746.3264774

https://doi.org/10.1145/3264746.3264774

Abstract

Imbalanced data is a critical problem in machine learning. Most imbalanced datasets contain one or more classes, called minority classes, that do not have enough samples for recognition. Many traditional classification algorithms are unable to recognize the minority class effectively. Clustering algorithms used for image segmentation may achieve high overall accuracy even though none of the samples in the minority class is classified correctly. In this study, we use three approaches, traditional oversampling, traditional undersampling, and the Synthetic Minority Over-sampling Technique (SMOTE), to reduce the imbalance in the number of samples between the majority classes and the minority classes in the dataset. The Fuzzy C-means algorithm (FCM) and the Possibilistic Clustering Algorithm (PCA) are used to segment the images whose samples are generated using the above sampling methods. Experimental results are evaluated using the Kappa coefficient and the confusion matrix. Our evaluation shows that oversampling, undersampling, and SMOTE can each improve the accuracy of imbalanced image segmentation [1].
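The abstract names two ideas that a short sketch may make concrete: SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, and the Kappa coefficient, which corrects raw accuracy for chance agreement. The Python below is a minimal illustration only, not the authors' implementation. The function names `smote` and `cohen_kappa`, the parameters `k` and `seed`, and the representation of samples as tuples of floats are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Sketch of SMOTE: build n_new synthetic points, each on the line
    segment between a random minority sample and one of its k nearest
    minority neighbours. Names and defaults here are illustrative."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix
    (rows: true class, columns: predicted class)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)
```

As a usage example, `cohen_kappa([[95, 0], [5, 0]])` describes a segmentation in which all 95 majority samples are correct and all 5 minority samples are assigned to the majority class. Accuracy is 95%, yet kappa is 0, which is the situation the abstract describes for clustering on imbalanced images.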

Full Text