Abstract

The traditional K-means algorithm is widely used in cluster analysis. However, it uses distance as its only constraint, which makes it sensitive to special data points. To address this problem, ambiguity is introduced as a new constraint in the K-means clustering process. On this basis, a new membership equation is proposed, together with a method for solving the initial cluster center points that reduces the risk caused by random selection of initial points. In addition, an optimized clustering algorithm with Gaussian distribution is derived using fuzzy entropy as the cost function constraint. Compared with the traditional clustering method, the membership degree of the new equation reflects the relationship between a point and a set more clearly, and it addresses the tendency of traditional K-means to become trapped in local convergence and to be easily influenced by noise. Experiments show that the new method requires fewer iterations and achieves better clustering accuracy than the other methods compared.

Highlights

  • The clustering process is the most effective classification method for people to summarize complex external information [1]

  • Geng et al., An Improved K-means Algorithm Based on Fuzzy Metrics: [...] other algorithms to realize the selection of clustering centers and the determination of the distance function. In [16], the result of singular value decomposition (SVD) is proposed as the initial point of clustering to obtain better clustering effects

  • This paper proposes an improved K-means algorithm based on fuzzy entropy


Summary

INTRODUCTION

The clustering process is the most effective classification method for people to summarize complex external information [1]. [...] other algorithms to realize the selection of clustering centers and the determination of the distance function. In [16], the result of singular value decomposition (SVD) is proposed as the initial point of clustering to obtain better clustering effects. In [33], a clustering method that combines the fuzzy c-means algorithm with an entropy-based algorithm is proposed to achieve clusters that are both distinct and compact. The proposed FMK-means algorithm first introduces artificial setting of the initial cluster center to reduce the influence of noise, integrates the overall distribution structure into that of the membership function, and compares the overall ambiguity of the cluster after introducing a certain point. The last step is convergence through iteration, which completes the clustering of the FMK-means algorithm.
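The ideas of membership degree and ambiguity mentioned above can be illustrated with a minimal sketch. It does not reproduce the paper's membership equation, which is not given in this summary. As stand-ins, it assumes the standard fuzzy c-means membership (with fuzzifier m = 2) and the De Luca-Termini fuzzy entropy. The function names, the example centers, and the example points are all hypothetical.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    Assumption: this is the textbook FCM formula, not the paper's equation.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]

def fuzzy_entropy(memberships):
    """De Luca-Termini fuzzy entropy (natural log), averaged over the values."""
    total = 0.0
    for u in memberships:
        for p in (u, 1.0 - u):
            if p > 0.0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return total / len(memberships)

# Hypothetical two-cluster example on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]
near = fcm_memberships((0.5, 0.0), centers)     # close to the first center
between = fcm_memberships((2.0, 0.0), centers)  # equidistant from both centers

print(near, fuzzy_entropy(near))
print(between, fuzzy_entropy(between))
```

Under these assumptions, a point near one center gets a membership close to 1 for that cluster and a low entropy, while a point equidistant from both centers gets memberships of 0.5 and the maximum entropy. This is the sense in which an entropy term can act as a measure of ambiguity alongside distance.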

K-MEANS ALGORITHM
MEMBERSHIP AND FUZZY METRICS
OPTIMIZATION DIRECTION
FMK-MEANS ALGORITHM DERIVATION
INITIAL CENTER POINT SOLUTION OF FMK-MEANS
THE FLOW OF FMK-MEANS ALGORITHM
FMK-MEANS ALGORITHM COMPLEXITY
EXPERIMENTAL ENVIRONMENT AND DESIGN
EXPERIMENTAL RESULTS AND ANALYSIS
Findings
CONCLUSION
