Abstract

Clustering, which explores the visualization and distribution of data, has recently been widely studied. Although current clustering algorithms such as DBSCAN can detect arbitrarily shaped clusters and work well, their parameters are often difficult to determine. Clustering by fast search of density peaks is a promising technique for solving this problem. However, current density peak methods struggle when the distribution within local clusters is uneven. To address this, we propose HCFS, a new density peak based clustering algorithm that employs a hierarchical strategy and consists of two main stages. In the first stage, HCFS estimates the density and distance of each point. Points with both high density and high distance are selected as candidate centers, and subclusters centered on them are then obtained. In the second stage, because adjacent subclusters built on candidate centers within the same cluster are highly similar and connected, we propose a new mechanism for measuring the dissimilarity and connectivity between subclusters. Highly similar and connected subclusters are merged, which increases the dissimilarity between different clusters and yields the final clustering result. Experiments conducted on a large number of datasets show that our method can effectively identify unevenly distributed clusters and yields better or comparable performance across different datasets.
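
The first stage can be illustrated with a minimal sketch. The abstract does not specify how HCFS estimates density and distance, so the code below assumes the classic density peaks formulation: density is the number of neighbors within a cutoff, and distance is the gap to the nearest point of higher density. The function name `density_peak_subclusters` and the parameters `d_c` and `n_centers` are hypothetical, chosen only for illustration. The second-stage merging is not sketched, because the abstract does not define its dissimilarity and connectivity measures.

```python
import numpy as np

def density_peak_subclusters(X, d_c, n_centers):
    """Sketch of a density-peak first stage (assumed, not the paper's exact estimator).

    Returns (rho, delta, centers, labels).
    """
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Assumed density estimate: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest; a stable sort breaks ties by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = dist[i].max()
            continue
        earlier = order[:rank]  # points at least as dense as i
        j = earlier[np.argmin(dist[i, earlier])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Candidate centers: points where density and distance are both large.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))
    # Each remaining point joins the subcluster of its nearest denser neighbor.
    # Parents are always visited first because they come earlier in `order`.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, centers, labels

# Two small, well-separated groups of four points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
rho, delta, centers, labels = density_peak_subclusters(X, d_c=0.5, n_centers=2)
print(centers, labels)
```

In HCFS, `n_centers` would deliberately be set higher than the expected number of clusters, since the second stage merges similar and connected subclusters back together.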
