An Improved K-Means Clustering Algorithm Based on Density Selection

Wenhao Xie, Xiaoyan Wang, Bowen Xu

doi:10.1007/978-3-030-62746-1_88

Abstract

The K-means clustering algorithm is an unsupervised learning method with simple principles, easy implementation, and strong adaptability. However, the number of clusters is difficult to determine, the algorithm is sensitive to the initial cluster centers, and the clustering result is easily affected by outliers. To address these disadvantages, this paper proposes an improved clustering algorithm based on density selection. The algorithm compares the neighborhood density of each sample with the average density of all samples, treats samples with lower density as outliers or isolated points, and deletes them. After this data pre-processing, a modified cluster validity index is minimized to obtain the optimal number of clusters, and the initial cluster centers are then optimized by a density selection strategy. Finally, experiments verify that the improved algorithm is more accurate than the traditional K-means algorithm and converges faster to the global minimum of the sum of squared errors (SSE).
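
The density-based pre-processing step described above can be illustrated with a minimal sketch. The abstract does not define "neighborhood density", so this example assumes it is the number of other samples within a fixed Euclidean radius; the function name `filter_low_density` and the parameter `eps` are hypothetical and are not taken from the paper.

```python
import numpy as np


def filter_low_density(X, eps):
    """Drop samples whose neighborhood density is below the average density.

    Assumption (not from the paper): the density of a sample is the number
    of other samples within Euclidean distance `eps` of it.
    Returns the kept samples and the boolean mask that selects them.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n_samples, n_samples).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Count neighbors within eps, excluding the sample itself.
    density = (dists <= eps).sum(axis=1) - 1
    # Samples with density below the mean are treated as outliers.
    keep = density >= density.mean()
    return X[keep], keep


# Four points on a unit square plus one distant point.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]])
kept, mask = filter_low_density(X, eps=1.5)
```

In this toy data set the four square corners each have three neighbors within the radius, while the distant point has none, so only the distant point falls below the average density and is removed. The pairwise distance matrix takes memory quadratic in the number of samples, so a larger data set would likely need a spatial index instead.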

Full Text