A novel clustering algorithm based on PageRank and minimax similarity

Qidong Liu, Yunyun Liu, Rongjing Hu, Zhili Zhao, Ruisheng Zhang, Xin Liu

doi:10.1007/s00521-018-3607-x

Abstract

Clustering by fast search and find of density peaks (herein called FDPC), a recently proposed density-based clustering algorithm, has attracted the attention of many researchers because it can recognize arbitrarily shaped clusters. In addition, FDPC needs only one parameter, $d_c$, and identifies the number of clusters from a decision graph. Nevertheless, it is not clear how to find a proper $d_c$ for a given data set, and such a perfect parameter may not exist in practice for multi-scale data sets. In this paper, we propose a modified PageRank algorithm to compute the local density of each data point, which is more robust than the Gaussian kernel and cutoff methods. Moreover, FDPC yields poor results on randomly distributed data sets, since there may be several density maxima within one cluster. To solve this problem, we propose an improved minimax similarity method. We compare our approach with FDPC on several artificial and real-life data sets; the experimental results indicate that it outperforms FDPC in terms of accuracy.
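To make the role of $d_c$ concrete, the following is a minimal sketch of the two quantities that the baseline FDPC plots in its decision graph: the cutoff-kernel local density and the distance to the nearest point of higher density. It illustrates the baseline only, not the PageRank-based density or the improved minimax similarity proposed in this paper. The function name `fdpc_density_and_delta`, the use of Euclidean distance, and the handling of density ties are assumptions made for the example.

```python
import numpy as np

def fdpc_density_and_delta(points, d_c):
    """Sketch of the baseline FDPC quantities (assumes d_c > 0).

    rho[i]   : number of other points closer than d_c to point i (cutoff kernel).
    delta[i] : distance from point i to the nearest point with strictly higher rho;
               for points with the highest rho, the largest distance to any point.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Count neighbours within d_c; subtract 1 to exclude the point itself.
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: treat as a candidate cluster center.
            delta[i] = dist[i].max()
    return rho, delta

# Two small, well-separated groups (hypothetical data for illustration).
example_points = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [10.5, 10.5],
]
rho, delta = fdpc_density_and_delta(example_points, d_c=0.8)
```

Points with both large `rho` and large `delta` stand out in the decision graph as candidate cluster centers. Rerunning the sketch with a different `d_c` changes `rho`, and therefore which points stand out, which is the sensitivity the abstract refers to.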
