Abstract

Uneven density data are data in which sample density differs between clusters. The local density used by the density peaks clustering algorithm (DPC) does not account for this difference in sample density between clusters, which may lead to the wrong cluster centers being selected. In addition, the allocation strategy of DPC tends to assign samples that belong to sparse clusters to dense clusters, which degrades the clustering result. In this study, we propose a density peaks clustering algorithm based on fuzzy and weighted shared neighbor for uneven density datasets (DPC-FWSN). First, a nearest neighbor fuzzy kernel function is obtained by combining K-nearest neighbor and fuzzy neighborhood. Then, local density is redefined using the nearest neighbor fuzzy kernel function. The redefined local density better characterizes the distribution of the samples by balancing the contribution of sample density in dense and sparse areas, so that a sparse cluster is not left without a cluster center. Finally, an allocation strategy based on weighted shared neighbor similarity is proposed to improve sample allocation at the boundary of sparse clusters. DPC-FWSN is compared with IDPC-FA, FKNN-DPC, FNDPC, DPCSA and DPC on uneven density datasets, datasets with complex morphologies, and real datasets. The clustering results demonstrate that DPC-FWSN effectively handles datasets with uneven density distribution.
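The problem described for baseline DPC can be illustrated with a minimal sketch. It computes the two quantities DPC is commonly defined with, a Gaussian-kernel local density and the distance to the nearest sample of higher density, on a synthetic dataset containing one dense and one sparse cluster. This is not the DPC-FWSN method: the function name `dpc_rho_delta`, the cutoff value `d_c`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def dpc_rho_delta(X, d_c):
    """Baseline DPC quantities (illustrative): local density rho and distance delta."""
    # Pairwise Euclidean distances between all samples
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Gaussian-kernel local density; subtract 1 to drop each sample's self-term
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser sample; the densest sample
        # gets its largest distance to any other sample instead
        delta[i] = D[i, higher].min() if higher.any() else D[i].max()
    return rho, delta

# Hypothetical data: 200 tightly packed samples and 40 widely spread samples
rng = np.random.default_rng(0)
dense = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(200, 2))
sparse = rng.normal(loc=(6.0, 0.0), scale=1.0, size=(40, 2))
X = np.vstack([dense, sparse])

rho, delta = dpc_rho_delta(X, d_c=0.5)  # d_c chosen by hand for this sketch
print("max density, dense cluster: ", rho[:200].max())
print("max density, sparse cluster:", rho[200:].max())
```

In this sketch every sample in the sparse cluster has a much lower local density than the densest samples of the dense cluster, so a center ranking driven by local density favors the dense cluster. This is the imbalance that the redefined local density in DPC-FWSN is meant to reduce.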
