Abstract

The traditional k-means algorithm selects its initial cluster centers at random, which makes it prone to converging to a local optimum. To select initial cluster centers that yield a better clustering result, an optimized initial cluster center algorithm based on density and dimension weighting is proposed. The dimension-weighted Euclidean distance is used to construct a dissimilarity matrix, from which the mean dissimilarity and the overall dissimilarity are calculated. A Gaussian kernel function is then introduced to determine the initial cluster centers. In tests on UCI datasets, the proposed algorithm improves the clustering indexes relative to the classical k-means algorithm, the k-means++ algorithm, and related improved algorithms, which indicates that the approach is feasible. Computation on high-dimensional data is time-consuming; on high-dimensional datasets, the proposed algorithm requires fewer iterations and achieves better clustering performance.
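
The abstract names the ingredients of the initialization (dimension-weighted Euclidean distance, a dissimilarity matrix, mean and overall dissimilarity, and a Gaussian kernel) but does not give the formulas. The following Python sketch shows one way these pieces could fit together. It is not the authors' algorithm: the variance-proportional dimension weights, the use of the overall dissimilarity as the kernel bandwidth, the greedy "density times distance to chosen centers" selection rule, and the function names `weighted_dissimilarity` and `init_centers` are all assumptions made for illustration.

```python
import numpy as np


def weighted_dissimilarity(X, w):
    """Pairwise Euclidean distance with each squared difference scaled by w."""
    diff = X[:, None, :] - X[None, :, :]           # shape (n, n, d)
    return np.sqrt((w * diff ** 2).sum(axis=2))     # shape (n, n)


def init_centers(X, k, w=None):
    """Pick k initial centers from the rows of X (illustrative sketch only)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    if w is None:
        # ASSUMPTION: weight each dimension by its share of the total variance.
        var = X.var(axis=0)
        w = var / var.sum() if var.sum() > 0 else np.full(d, 1.0 / d)

    D = weighted_dissimilarity(X, w)

    # Mean dissimilarity of each point to all other points.
    mean_dis = D.sum(axis=1) / (n - 1)
    # Overall dissimilarity: average of the per-point mean dissimilarities.
    overall = max(mean_dis.mean(), 1e-12)

    # ASSUMPTION: Gaussian kernel density with bandwidth = overall dissimilarity.
    density = np.exp(-(D / overall) ** 2).sum(axis=1)

    # The first center is the densest point.
    chosen = [int(np.argmax(density))]
    # ASSUMPTION: each further center maximizes density times the distance
    # to the nearest center chosen so far.
    while len(chosen) < k:
        nearest = D[:, chosen].min(axis=1)
        chosen.append(int(np.argmax(density * nearest)))

    return X[chosen]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                      rng.normal(5.0, 0.1, (20, 2))])
    print(init_centers(data, k=2))
```

The returned rows can be passed to any k-means implementation that accepts explicit initial centers. Note that the pairwise difference array takes O(n²d) memory, which is consistent with the abstract's remark that high-dimensional data is costly to process.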

