Abstract

The density peaks clustering (DPCluster) algorithm, published in the journal Science in 2014, is a simple and efficient density-based clustering algorithm. However, its cut-off distance parameter d_c is chosen subjectively, which can lead to poor clustering accuracy. In addition, the assignment strategy that DPCluster proposes does not handle datasets with manifold structure well. To overcome these defects, this paper proposes an optimized clustering algorithm, named KIK-DPCluster, that combines the K-distance graph with iterative KNN (K-Nearest Neighbor). We use the idea of the K-distance graph to define the parameter d_c so that cluster cores and halos are separated accurately. When assigning points to clusters, we divide the original strategy into three stages by combining the original algorithm with KNN and introducing the idea of the ‘result of the existing label’. First, the points that satisfy the constraint are assigned. Then, KNN is used to assign the remaining points. Finally, we use the existing label information to re-determine the result by applying KNN iteratively, in a manner similar to a ‘semi-supervised’ algorithm. KNN thus plays both an assignment role and a correction role in this process. Experiments on publicly available synthetic and real-world datasets show that the proposed algorithm outperforms DPCluster on datasets with manifold structure. Moreover, compared with the classic clustering algorithm, KIK-DPCluster increases the accuracy of the clustering results by 13% to 14%. Finally, applying KIK-DPCluster to a Weibo check-in dataset shows that the clustering result is consistent with the actual situation, suggesting that the algorithm has great practical value.
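
The two ideas named in the abstract, reading d_c off a K-distance graph and using KNN first to assign and then to correct labels, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the knee rule in `pick_cutoff`, the use of -1 for an unassigned point, and the `max_rounds` limit are all assumptions made for illustration. The first stage (assigning points that satisfy the constraint) is not reproduced, because the abstract does not state the constraint.

```python
import math
from collections import Counter


def pairwise_distances(points):
    """Full Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def k_distance_curve(dist, k):
    """Ascending distances from each point to its k-th nearest neighbor."""
    # sorted(row)[0] is the point itself (distance 0), so index k is the k-th neighbor.
    return sorted(sorted(row)[k] for row in dist)


def pick_cutoff(curve):
    """Assumed knee rule: the value lying farthest below the chord joining the endpoints.

    Expects at least two values in `curve`.
    """
    n = len(curve)
    lo, hi = curve[0], curve[-1]
    knee = max(range(n), key=lambda i: lo + (hi - lo) * i / (n - 1) - curve[i])
    return curve[knee]


def knn_vote(dist, labels, i, k):
    """Majority label among the k nearest labeled neighbors of point i (-1 if none)."""
    labeled = [j for j in range(len(labels)) if j != i and labels[j] != -1]
    nearest = sorted(labeled, key=lambda j: dist[i][j])[:k]
    if not nearest:
        return -1
    return Counter(labels[j] for j in nearest).most_common(1)[0][0]


def iterative_knn(dist, labels, k, max_rounds=20):
    """Assign unlabeled points with KNN, then re-vote all labels until they are stable."""
    labels = list(labels)
    # Assignment pass: label each unassigned point (-1) from its labeled neighbors.
    for i in range(len(labels)):
        if labels[i] == -1:
            labels[i] = knn_vote(dist, labels, i, k)
    # Correction passes: every point is re-voted using the current labels.
    for _ in range(max_rounds):
        updated = [knn_vote(dist, labels, i, k) for i in range(len(labels))]
        if updated == labels:
            break
        labels = updated
    return labels


# Two well-separated groups; half of the points start unlabeled.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
dist = pairwise_distances(points)
d_c = pick_cutoff(k_distance_curve(dist, k=2))
final_labels = iterative_knn(dist, [0, 0, -1, -1, 1, 1, -1, -1], k=3)
```

In this sketch the correction loop treats the labels from the previous round as provisional supervision, which is one way to read the ‘semi-supervised’ analogy in the abstract. The stopping rule (no label changes between rounds) is a design choice of the sketch, not something the abstract specifies.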
