The application of improved imbalanced learning based on fuzzy C-means clustering (FCM) and SMOTE

Haider Mustafa Mueen, A Ghazikhani, Yasir Abdul Zahra Flaiyh Alaabedi

doi:10.1063/5.0073576

Abstract

Data classification is one of the most widely applied branches of pattern recognition and data mining, and its applications are easy to find in everyday life. In the last few years, data classification technology has changed substantially. As the range of applications has grown, so has the volume of data. Classification has become difficult because of the very large size and imbalanced nature of the data. An imbalanced class distribution significantly degrades the performance of standard classification learning algorithms, which assume that the class distribution is relatively balanced. This paper presents a simple and effective sampling method based on Fuzzy C-means Clustering (FCM) and SMOTE (Synthetic Minority Oversampling Technique) that prevents noise generation and effectively resolves the imbalance between classes. The experimental evaluation shows that the proposed technique effectively reduces noise production. The results indicate that the accuracy of the proposed method improves by an average of two percent over the base paper across different datasets.
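
The abstract does not spell out how FCM and SMOTE are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the minority class is first clustered with fuzzy C-means, that points without a clear cluster membership are skipped as potential noise sources, and that SMOTE-style interpolation is then done only between neighbours inside the same cluster. The function names `fcm` and `fcm_smote`, the `threshold` parameter, the farthest-first initialisation, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fcm(points, c, m=2.0, iters=30):
    """Plain fuzzy C-means. Returns (centers, memberships)."""
    d = len(points[0])
    # Assumption: deterministic farthest-first initial centres, chosen
    # here only so that the sketch is reproducible.
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(math.dist(p, cj) for cj in centers))
        centers.append(list(far))

    def memberships():
        # u_ij is proportional to dist(x_i, c_j) ** (-2 / (m - 1)),
        # normalised so that each row sums to 1.
        rows = []
        for p in points:
            inv = [max(math.dist(p, cj), 1e-12) ** (-2.0 / (m - 1.0))
                   for cj in centers]
            total = sum(inv)
            rows.append([v / total for v in inv])
        return rows

    for _ in range(iters):
        u = memberships()
        # Centre update: mean of the points weighted by u_ij ** m.
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            tot = sum(w)
            new_centers.append(
                [sum(wi * p[k] for wi, p in zip(w, points)) / tot
                 for k in range(d)])
        centers = new_centers
    return centers, memberships()


def fcm_smote(minority, n_new, c=2, k=3, threshold=0.6, seed=0):
    """Generate n_new synthetic minority points inside FCM clusters."""
    rng = random.Random(seed)
    _, u = fcm(minority, c)
    # Keep only points whose strongest membership reaches the threshold
    # (hypothetical rule); ambiguous points are never used as seeds.
    clusters = [[] for _ in range(c)]
    for p, row in zip(minority, u):
        j = max(range(c), key=lambda t: row[t])
        if row[j] >= threshold:
            clusters[j].append(p)
    usable = [cl for cl in clusters if len(cl) >= 2]
    if not usable:
        raise ValueError("no cluster has two confidently assigned points")

    synthetic = []
    while len(synthetic) < n_new:
        cl = rng.choice(usable)
        i = rng.randrange(len(cl))
        base = cl[i]
        others = [q for t, q in enumerate(cl) if t != i]
        # SMOTE step: pick one of the k nearest same-cluster neighbours
        # and interpolate at a random position on the connecting segment.
        nearest = sorted(others, key=lambda q: math.dist(base, q))[:k]
        nb = rng.choice(nearest)
        gap = rng.random()
        synthetic.append([b + gap * (x - b) for b, x in zip(base, nb)])
    return synthetic


# Toy minority class: two tight groups plus one point halfway between them.
minority = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
            (10, 10), (11, 10), (10, 11), (11, 11), (10.5, 10.5),
            (5.5, 5.5)]
synthetic = fcm_smote(minority, n_new=8)
```

In this toy setup the halfway point has a membership of about 0.5 in each cluster, so it falls below the assumed threshold and no synthetic sample is interpolated toward it. Plain SMOTE, by contrast, could pair points from different groups and place samples in the empty region between them, which is the kind of noise the abstract says the method avoids.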

