Abstract

The improved fuzzy c-means (IFCM) algorithm is an effective technique for handling the “uniform effect” in imbalanced data clustering; it adjusts the weight of each class based on the fuzzy size between clusters. However, as the imbalance rate increases, the IFCM algorithm produces a “siphon effect”: it misclassifies samples from small classes into large ones. Our analysis shows that this effect occurs because all samples in the same class share the same weight value and the membership values become polarized, so the model fails to converge to the correct interval. We therefore propose an imbalanced fuzzy c-means clustering algorithm based on edge modification (EM-IFCM) to alleviate the “siphon effect” of IFCM. It achieves stronger inter-class separability by dynamically adjusting the sample weights to enhance the influence of edge samples on the model. In addition, we analyze the effectiveness and complexity of the algorithm and prove its convergence. Finally, we conduct extensive experiments on synthetic, machine-learning, and image-segmentation datasets and compare the results with those of six algorithms. The experimental results show that EM-IFCM achieves higher accuracy and tolerates an imbalance rate at least 1.94 times higher than that of the other algorithms.
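
For readers unfamiliar with the baseline, the following is a minimal sketch of standard fuzzy c-means, the method that IFCM and EM-IFCM modify. It is not the IFCM or EM-IFCM algorithm, whose update rules are not given in the abstract. The sketch assumes that the "fuzzy size" of a cluster is the sum of its membership values, which is an interpretation rather than something the abstract confirms. The function name `fcm`, its parameters, and the toy data are illustrative only.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (baseline only, NOT IFCM or EM-IFCM)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix U (n x c); each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Cluster centers: membership-weighted means of the samples.
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

# Toy imbalanced data (hypothetical): 1000 samples in one class, 50 in another.
rng = np.random.default_rng(1)
large = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(1000, 2))
small = rng.normal(loc=[4.0, 0.0], scale=0.5, size=(50, 2))
X = np.vstack([large, small])

U, V = fcm(X, c=2)
# Assumed definition of fuzzy size: total membership assigned to each cluster.
fuzzy_size = U.sum(axis=0)
print("centers:", V)
print("fuzzy sizes:", fuzzy_size)
```

Comparing the printed fuzzy sizes with the true class sizes (1000 and 50) gives a rough way to check whether the baseline pulls the two clusters toward similar sizes on a given dataset, which is the "uniform effect" that size-based class weights are meant to counter.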
