Abstract

Features in data samples usually need to be brought to a common scale through standardization before clustering. However, the metric itself can remain nonstandardized: the distance between samples may still exceed 1 after the features are standardized. When the metric between data samples is not standardized, it is difficult to find the optimal search path. To address this problem, we develop a dynamic-metric accelerated method for fuzzy clustering by introducing a metric matrix, whose diagonal elements consist of infinite norms of the metric matrix, into the Fuzzy C-Means (FCM) clustering algorithm and its derived algorithms. More specifically, we focus on constructing a dynamic metric matrix that unifies the metric between data samples, and on updating the cluster centers so as to optimize their search path. In addition, we propose a new evaluation index, the Coefficient of Variation Metric (CVM), to evaluate metric effectiveness. The dynamic-metric accelerated method leaves the complexity unchanged while effectively accelerating the iteration of fuzzy clustering. Comparisons between each algorithm using the dynamic-metric accelerated method and the corresponding original algorithm on UCI, business district, and COVID-19 CT image datasets show the superiority of the method in both acceleration and clustering performance.
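To make the setting concrete, the following is a minimal sketch of baseline FCM in which the squared distance is weighted by a diagonal metric. It is not the dynamic metric matrix described in the abstract: the function name `fcm` and the parameter `metric_diag` are hypothetical, and the weights are supplied by the caller rather than derived from the data during iteration.

```python
import numpy as np

def fcm(X, c, m=2.0, metric_diag=None, max_iter=100, tol=1e-5, seed=0):
    """Baseline Fuzzy C-Means with an optional diagonal metric.

    metric_diag is a hypothetical per-feature weight vector a, so that
    d^2(x, v) = sum_j a_j * (x_j - v_j)^2. With a = 1 this is plain FCM.
    Returns memberships U (n x c), centers V (c x d), and iterations used.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.ones(d) if metric_diag is None else np.asarray(metric_diag, dtype=float)

    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    for it in range(1, max_iter + 1):
        Um = U ** m
        # Cluster centers are membership-weighted means of the samples.
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Squared weighted distances between samples and centers, shape (n, c).
        diff = X[:, None, :] - V[None, :, :]
        D2 = np.maximum((diff ** 2 * a).sum(axis=2), 1e-12)

        # Standard membership update: u_ik proportional to D2_ik^(-1/(m-1)).
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, it
```

Calling `fcm(X, c=3)` runs plain Euclidean FCM, and the returned iteration count gives a simple way to compare how quickly different choices of `metric_diag` converge on the same data.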

Highlights

  • The dynamic metric accelerated method, whose complexity remains unchanged, can effectively accelerate the iteration speed of fuzzy clustering

  • Relative entropy measures the difference between data from the perspective of probability distribution and uncertainty [51][52][53]; it has the advantage of being insensitive to common noise [54][55]. Euclidean distance is the most popular metric in objective functions of fuzzy clustering because it can reflect the real distance in the sample space [46]

  • The Mahalanobis distance can better reflect the correlation between data samples, so the process of fuzzy clustering is not affected by the feature dimensions. However, computing the inverse of the covariance matrix in the Mahalanobis distance greatly increases the computational complexity [48][49][50]
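The three metrics mentioned in the highlights can be compared in a short sketch. It uses synthetic data and hypothetical helper names (`euclidean`, `mahalanobis`, `relative_entropy`), and it only illustrates the standard textbook definitions, not any variant specific to the paper. Rescaling one feature changes the Euclidean distance but leaves the Mahalanobis distance unchanged, at the price of inverting the covariance matrix.

```python
import numpy as np

def euclidean(x, y):
    """Straight-line distance in the sample space."""
    return float(np.linalg.norm(x - y))

def mahalanobis(x, y, cov_inv):
    """Distance that accounts for feature scale and correlation."""
    diff = x - y
    return float(np.sqrt(diff @ cov_inv @ diff))

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.

    Assumes q > 0 wherever p > 0.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Synthetic, correlated data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] += 0.8 * X[:, 0]

# Inverting the d x d covariance matrix costs O(d^3) in the feature count d.
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

# Rescale one feature, as a change of units would.
Xs = X * np.array([1.0, 1000.0, 1.0])
cov_inv_s = np.linalg.inv(np.cov(Xs, rowvar=False))

d_e, d_e_s = euclidean(X[0], X[1]), euclidean(Xs[0], Xs[1])
d_m = mahalanobis(X[0], X[1], cov_inv)
d_m_s = mahalanobis(Xs[0], Xs[1], cov_inv_s)

print("Euclidean before/after rescaling:  ", d_e, d_e_s)
print("Mahalanobis before/after rescaling:", d_m, d_m_s)
```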


Summary

TABLE I INFLUENCE OF M VALUE ON ITERATIONS

Compares the number of iterations under the Euclidean metric and the dynamic metric for m = 2, 3, 4, 5, 6, 7.

Purity(Ω, C) / iter is used to evaluate the improvement level of the clustering accuracy of the algorithm, where

Purity(\Omega, C) = \frac{1}{n} \sum_{k} \max_{j} |\omega_k \cap c_j|
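The purity-per-iteration index named above can be sketched as follows. The sketch assumes the standard definition of purity (for each predicted cluster, count its most frequent true class, sum the counts, and divide by the number of samples); the function names `purity` and `purity_per_iter` are hypothetical.

```python
import numpy as np

def purity(pred, truth):
    """Standard clustering purity: (1/n) * sum_k max_j |cluster_k ∩ class_j|."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    total = 0
    for k in np.unique(pred):
        # Size of the largest true class inside predicted cluster k.
        _, counts = np.unique(truth[pred == k], return_counts=True)
        total += counts.max()
    return total / len(pred)

def purity_per_iter(pred, truth, n_iter):
    """Purity divided by the number of iterations the algorithm needed."""
    return purity(pred, truth) / n_iter

pred = [0, 0, 0, 1, 1, 1]
truth = [0, 0, 1, 1, 1, 1]
print(purity(pred, truth))               # 5 of 6 samples fall in their cluster's majority class
print(purity_per_iter(pred, truth, 10))
```

A higher value of the ratio means the algorithm reached a given purity in fewer iterations, which is why it rewards both accuracy and acceleration.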

INFLUENCE OF M VALUE ON PURITY
Traditional metric
After acceleration by the dynamic metric in Figure
Dynamic metric accelerated method
The time complexity of calculating Euclidean distance is
Iris with data further from the center
Seed Thyroid
EXPERIMENTAL RESULTS ANALYSIS
Assisted Multiobjective Kernel Intuitionistic Fuzzy Clustering