Abstract

The traditional K-means algorithm is sensitive to its initial cluster centers: different random initializations can produce different clustering results. This paper proposes an improved K-means algorithm based on Mean Shift clustering to address this problem. The algorithm uses Mean Shift to obtain a high-density migration vector set MP, and then selects from the high-density regions of MP the k points that are farthest from one another as the initial cluster centers. The algorithm is evaluated on the iris and wine data sets from the UCI repository, and on a real sample, referred to as the Tibetan dataset, of 150 images of vowels located above the baseline, drawn from the text analysis of Ujin-style Tibetan ancient books. The experimental results show that the algorithm achieves better clustering results, with higher accuracy and greater stability.
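
The initialization idea can be illustrated with a minimal sketch. It is not the paper's implementation: the flat (uniform) kernel, the `bandwidth` parameter, the neighbor-count density estimate, the greedy max-min selection rule, and the function names `mean_shift_points` and `pick_initial_centers` are all assumptions made for illustration. The returned points would then be passed to a standard K-means routine as its starting centers.

```python
import math


def mean_shift_points(points, bandwidth, max_iter=50, tol=1e-6):
    """Shift each point toward the mean of the data within `bandwidth` of it.

    Assumption: a flat kernel; the abstract does not say which kernel is used.
    The returned list plays the role of the shifted point set (MP).
    """
    data = [tuple(p) for p in points]
    shifted = list(data)
    for _ in range(max_iter):
        largest_move = 0.0
        updated = []
        for p in shifted:
            neighbours = [q for q in data if math.dist(p, q) <= bandwidth]
            if not neighbours:  # defensive guard; keep the point where it is
                updated.append(p)
                continue
            mean = tuple(sum(axis) / len(neighbours) for axis in zip(*neighbours))
            largest_move = max(largest_move, math.dist(p, mean))
            updated.append(mean)
        shifted = updated
        if largest_move < tol:
            break
    return shifted


def pick_initial_centers(points, k, bandwidth):
    """Pick k mutually distant points from the shifted set as initial centers.

    Assumption: start from the densest shifted point, then greedily add the
    shifted point whose nearest already-chosen center is farthest away.
    """
    data = [tuple(p) for p in points]
    modes = mean_shift_points(data, bandwidth)
    # Density estimate: number of data points within `bandwidth` of each mode.
    density = [sum(1 for q in data if math.dist(m, q) <= bandwidth) for m in modes]
    first = max(range(len(modes)), key=density.__getitem__)
    centers = [modes[first]]
    while len(centers) < k:
        farthest = max(modes, key=lambda m: min(math.dist(m, c) for c in centers))
        centers.append(farthest)
    return centers


if __name__ == "__main__":
    # Three small, well-separated groups of 2-D points (made-up toy data).
    toy = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
           (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1),
           (10, 0), (10.1, 0), (10, 0.1), (10.1, 0.1)]
    for center in pick_initial_centers(toy, k=3, bandwidth=1.0):
        print(center)
```

Because the centers are chosen deterministically from the shifted points, repeated runs on the same data start K-means from the same place, which is the source of the stability the abstract describes. The greedy rule is only one way to read "farthest from each other"; an exhaustive search over k-subsets would be another.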
