Abstract

Density Peaks Clustering (DPC) uses two objectives, density and peaks, to determine the number of clusters automatically, and it is claimed to be applicable to data sets with non-spherical clusters. However, the cutoff distance dc in DPC must be set from the decision maker's experience, and the cluster centers must be selected manually. Both choices are difficult to make, and a poor choice of either leads to incorrect clustering results. To overcome these shortcomings, an adaptive method for computing the cutoff distance based on the Gini index is first proposed. Then the possibility of each point xi being a cluster center is calculated as the product of its local density and its relative distance, γi = ρiδi, and the point with the maximal change in possibility is taken as the critical point. Each point whose possibility is larger than that of the critical point becomes a cluster center. In this way, both the number of clusters and the cluster centers are determined automatically, and manual selection of cluster centers through the decision graph in DPC is avoided. Based on these two steps, a new density peak clustering algorithm that automatically determines both the number of clusters and the cluster centers is proposed. Experiments show that the new algorithm not only determines the cluster centers automatically but also achieves higher accuracy than DPC.
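The center-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the local density ρi uses a Gaussian kernel, the cutoff distance `dc` is passed in rather than derived from the Gini index, and the "maximal change of possibility" is read as the largest drop between consecutive values once the possibilities ρiδi are sorted in descending order. The function name `select_centers` is hypothetical.

```python
import numpy as np


def select_centers(X, dc):
    """Pick cluster centers from the possibility rho_i * delta_i.

    Assumptions (not taken from the abstract): Gaussian-kernel density,
    and the critical point is the point just below the largest drop in
    the descending-sorted possibilities.
    """
    n = len(X)
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Local density rho_i; subtract 1 to remove each point's own contribution.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0

    # Relative distance delta_i: distance to the nearest point of higher
    # density. The densest point gets its largest distance to any point.
    order = np.argsort(-rho)
    delta = np.empty(n)
    delta[order[0]] = d[order[0]].max()
    for k in range(1, n):
        i = order[k]
        delta[i] = d[i, order[:k]].min()

    # Possibility of each point being a cluster center.
    gamma = rho * delta

    # Sort possibilities in descending order and locate the largest drop.
    ranked = np.argsort(-gamma)
    drops = gamma[ranked[:-1]] - gamma[ranked[1:]]
    k = int(np.argmax(drops))

    # ranked[k + 1] plays the role of the critical point; every point
    # ranked above it is returned as a cluster center.
    return ranked[: k + 1], gamma
```

Calling `select_centers(X, dc)` on an `(n, d)` array returns the indices of the selected centers together with the possibility of every point. The number of clusters is then the length of the returned index array, so no decision graph has to be inspected by hand.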
