Abstract

The fuzzy C-means (FCM) algorithm is one of the most widely used algorithms in unsupervised pattern recognition. As observation noise intensifies, however, FCM tends to produce large deviations in the cluster centers and even overlapping clusters. The relative entropy fuzzy C-means algorithm (REFCM) adds relative entropy to FCM as a regularization function, which gives it a good ability to detect noise and to assign memberships to observed values. However, REFCM still tends to generate overlapping clusters as cluster sizes grow and become imbalanced, and it converges slowly. To address these problems, a modified suppressed relative entropy fuzzy C-means clustering algorithm (MSREFCM) is proposed. Specifically, MSREFCM adds a suppression strategy based on the intra-class average distance, which improves convergence speed while maintaining the accuracy and noise robustness of REFCM. In addition, to further improve clustering performance on multidimensional imbalanced data, the Euclidean distance in REFCM is replaced with the Mahalanobis distance, which resolves the center overlap and center offset problems for elliptical data. Experiments on several synthetic and UCI datasets indicate that MSREFCM improves on the convergence speed and classification performance of REFCM for spherical and ellipsoidal datasets with imbalanced sizes. In particular, on the Statlog dataset, the running time of MSREFCM is nearly one second shorter than that of REFCM, and its accuracy is 0.034 higher.
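
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the MSREFCM algorithm itself. It shows the standard FCM membership and center updates, a squared-distance helper that switches between the Euclidean and Mahalanobis distance, and the classic suppressed-FCM rule with a fixed suppression rate `alpha`. The fixed `alpha` is an assumption that stands in for the intra-class average distance rule, whose exact form the abstract does not give. The relative entropy regularization term is omitted for the same reason. All function names are hypothetical.

```python
import numpy as np


def squared_distances(X, centers, covs=None):
    """Squared distance from each point to each center, shape (n, c).

    Euclidean when covs is None; otherwise Mahalanobis, using one
    covariance matrix per cluster (covs has shape (c, d, d)).
    """
    diff = X[:, None, :] - centers[None, :, :]  # (n, c, d)
    if covs is None:
        return np.einsum("ncd,ncd->nc", diff, diff)
    inv = np.linalg.inv(covs)  # inverts each (d, d) matrix in the stack
    return np.einsum("ncd,cde,nce->nc", diff, inv, diff)


def fcm_memberships(d2, m=2.0, eps=1e-12):
    """Standard FCM update: u_ij proportional to (d_ij^2)^(-1/(m-1))."""
    w = (d2 + eps) ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)


def suppress(U, alpha=0.5):
    """Classic suppressed-FCM step with a fixed rate alpha (assumption).

    Non-winning memberships are scaled by alpha; the winning cluster
    receives the remainder, so each row still sums to 1.
    """
    S = alpha * U
    rows = np.arange(U.shape[0])
    winners = U.argmax(axis=1)
    S[rows, winners] = 1.0 - alpha + alpha * U[rows, winners]
    return S


def update_centers(X, U, m=2.0):
    """Standard FCM center update: membership-weighted mean of the points."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]
```

One iteration of this sketch would call `squared_distances`, then `fcm_memberships`, then `suppress`, then `update_centers`. Suppression sharpens the memberships toward the nearest cluster, which is the general mechanism by which suppressed variants speed up convergence. Passing per-cluster covariance matrices lets elongated clusters be measured along their own axes instead of with spherical distances.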
