Abstract
Most clustering algorithms cannot handle samples that carry uncertain information. Drawing on fuzzy set theory and probability theory, we first define the fuzzy-probability binary measure space and triangular fuzzy normal random variables. Building on the advantages of the k-means algorithm (a simple principle, few parameters, fast convergence, good clustering performance, and good scalability), we then propose a clustering algorithm for samples containing multiple triangular fuzzy normal random variables, which we call the TFNRV-k-means algorithm. The algorithm uses our proposed Euclidean random comprehensive absolute distance (ERCAD) as its distance measure. Under the fuzzy measure, the lower bound, principal value, and upper bound of the triangular fuzzy normal random variables are each updated by averaging, and the cluster centers are updated until they no longer change. We analyze the time complexity of the proposed algorithm and test it on different sample sets through random simulation experiments. The highest clustering accuracy obtained is 99.00% and the maximum Kappa coefficient is 0.9850, from which we conclude that the TFNRV-k-means algorithm clusters such samples well. Finally, we summarize the article, list the advantages and disadvantages of the TFNRV-k-means algorithm, and propose corresponding improvements, which provide ideas for further research on TFNRV-k-means.
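The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each feature of a sample is summarized by a triple (lower bound, principal value, upper bound), and it uses a plain Euclidean distance over those triples as a stand-in, because the definition of ERCAD is not given in the abstract. The names `distance`, `mean_center`, and `tfn_kmeans` are hypothetical.

```python
import random


def distance(a, b):
    """Stand-in distance (NOT the paper's ERCAD): Euclidean distance taken
    over every (lower, principal, upper) component of every feature."""
    return sum(
        (x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)
    ) ** 0.5


def mean_center(members):
    """Average the lower, principal, and upper values separately, per feature."""
    n = len(members)
    n_features = len(members[0])
    return [
        tuple(sum(s[f][c] for s in members) / n for c in range(3))
        for f in range(n_features)
    ]


def tfn_kmeans(samples, k, max_iter=100, seed=0):
    """k-means-style loop over samples whose features are triples."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)  # initial centers: k distinct samples
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: distance(s, centers[j]))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centers are stable
        labels = new_labels
        # Update step: componentwise means; an empty cluster keeps its center.
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                centers[j] = mean_center(members)
    return labels, centers


# Toy data (illustrative only): two features per sample, two obvious groups.
samples = [
    [(0.9, 1.0, 1.1), (1.8, 2.0, 2.2)],
    [(1.0, 1.1, 1.2), (1.9, 2.1, 2.3)],
    [(7.9, 8.0, 8.1), (8.8, 9.0, 9.2)],
    [(8.0, 8.1, 8.2), (8.9, 9.1, 9.3)],
]
labels, centers = tfn_kmeans(samples, k=2)
print(labels)
```

Replacing `distance` with an implementation of ERCAD as defined in the full paper would bring the sketch closer to the proposed algorithm; the assignment and update steps would stay the same.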