An Improved K-Means Clustering Algorithm Based on Semantic Model

Zhe Liu, Fei Ding, Jianmin Bao

doi:10.1145/3148453.3306269

Abstract

The k-means algorithm is one of the most influential clustering algorithms in data mining. It is widely used in many settings, such as schools, where it is applied to the daily consumption, transfers, and curriculum arrangement of different student groups. However, the traditional k-means algorithm is sensitive to the initial cluster centers, and the clustering result depends heavily on how they are chosen. To obtain a more accurate clustering result, we propose a k-means algorithm based on semantic improvement. We calculate the grid density of the samples, set a density threshold to remove outliers, and classify the data points as core points, boundary points, or noise points according to their grid density, which optimizes the clusters and effectively reduces noise interference. We then build the semantic relationships among the data in each cluster and use them to optimize the selection of the initial cluster centers. Simulation experiments are carried out on five common datasets from the UCI database. The results show that the method based on the improved k-means algorithm reduces iteration time compared with prior methods and also improves accuracy.
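
The general idea of density-based filtering followed by center seeding can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the grid construction, the core/boundary distinction, or the semantic model, so the sketch below only separates dense cells from noise and then seeds centers with a farthest-first rule, which is a substitute chosen for illustration. The function names (`grid_density_filter`, `density_seeded_centers`, `kmeans`) and the parameters `cells_per_dim` and `density_threshold` are hypothetical.

```python
import numpy as np

def grid_density_filter(X, cells_per_dim=10, density_threshold=2):
    """Assign points to a uniform grid and flag points in sparse cells as noise.

    Assumption: a cell's density is simply the number of points it contains.
    """
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    # Integer cell index per dimension, clipped so the maximum lands in the last cell.
    cells = np.minimum(((X - lo) / span * cells_per_dim).astype(int),
                       cells_per_dim - 1)
    _, inverse, counts = np.unique(cells, axis=0,
                                   return_inverse=True, return_counts=True)
    inverse = inverse.ravel()  # cell id of each point
    keep = counts[inverse] >= density_threshold
    return keep, inverse, counts

def density_seeded_centers(X, k, cells_per_dim=10, density_threshold=2):
    """Pick k initial centers from centroids of dense cells.

    Assumption: start at the densest cell, then add the candidate farthest
    from all centers chosen so far (farthest-first), not the paper's rule.
    """
    X = np.asarray(X, dtype=float)
    keep, inverse, counts = grid_density_filter(X, cells_per_dim,
                                                density_threshold)
    dense_cells = np.flatnonzero(counts >= density_threshold)
    if len(dense_cells) < k:
        raise ValueError("fewer dense cells than requested clusters")
    cands = np.array([X[inverse == c].mean(axis=0) for c in dense_cells])
    chosen = [int(np.argmax(counts[dense_cells]))]
    while len(chosen) < k:
        dists = np.linalg.norm(cands[:, None, :] - cands[chosen][None, :, :],
                               axis=2).min(axis=1)
        chosen.append(int(np.argmax(dists)))
    return cands[chosen], keep

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = np.linalg.norm(X[:, None, :] - centers[None, :, :],
                                axis=2).argmin(axis=1)
        # Keep the old center if a cluster ends up empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```

A typical call would be `init, keep = density_seeded_centers(X, k)` followed by `kmeans(X[keep], init)`, so that points flagged as noise do not influence the final centers.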

Full Text