Abstract

The DPC (Clustering by fast search and find of Density Peaks) algorithm and its variants typically use Euclidean distance, which overlooks the differing contributions of individual features to similarity and hence to clustering. To address this limitation, this paper proposes a standard deviation weighted distance that refines the Euclidean distance by accounting for the contribution of each feature to the distance (similarity) between data points. The local density ρi and distance δi of point i are defined with this weighted distance, so as to capture the local pattern around point i as fully as possible. Outliers are also defined using the weighted distance. A divide-and-conquer assignment strategy is proposed that combines the weighted distance with semi-supervised learning and the mutual K-nearest neighbor assumption. Together these yield the SFKNN-DPC (Standard deviation weighted distance and Fuzzy weighted K-Nearest Neighbors based Density Peak Clustering) algorithm, which aims to uncover the hidden clusters within a dataset. Extensive experiments on benchmark datasets show that SFKNN-DPC outperforms DPC, its variants, and other benchmark clustering algorithms. Statistical significance tests further indicate that SFKNN-DPC differs significantly from its counterparts.
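The abstract does not give the formulas, so the following is only a minimal sketch of the general idea and not the paper's definitions. It assumes that each feature's weight is its standard deviation divided by the sum of all feature standard deviations, and that ρi is a K-nearest-neighbor density computed from the weighted distance. The function names (`std_weights`, `weighted_distances`, `density_and_delta`) and the parameter `k` are hypothetical. The definition of δi follows the original DPC convention: the distance to the nearest point of higher density.

```python
import numpy as np

def std_weights(X):
    # ASSUMPTION: weight of feature j = std_j / sum of all feature stds.
    # The paper's actual weighting may differ.
    s = X.std(axis=0)
    total = s.sum()
    if total == 0:
        # All features constant: fall back to equal weights.
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return s / total

def weighted_distances(X, w):
    # Pairwise weighted Euclidean distance:
    # d(i, j) = sqrt(sum_f w_f * (x_if - x_jf)^2)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=-1))

def density_and_delta(D, k):
    n = D.shape[0]
    # ASSUMPTION: KNN-style density, summing exp(-d) over the k nearest
    # neighbors (column 0 of the sorted row is the point itself).
    knn = np.sort(D, axis=1)[:, 1:k + 1]
    rho = np.exp(-knn).sum(axis=1)
    # delta_i: distance to the nearest point ranked denser than i
    # (ties are broken by the stable sort order).
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(n)
    delta[order[0]] = D[order[0]].max()  # densest point gets the max distance
    for rank in range(1, n):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    return rho, delta

if __name__ == "__main__":
    # Two small, well-separated groups of points (made-up toy data).
    X = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=float)
    w = std_weights(X)
    D = weighted_distances(X, w)
    rho, delta = density_and_delta(D, k=2)
    # Points with large rho * delta are candidate density peaks.
    print(np.argsort(-(rho * delta))[:2])
```

In DPC-style methods, points with both high ρi and high δi are taken as cluster centers. The sketch stops at that step and does not attempt the outlier definition or the divide-and-conquer assignment strategy described in the abstract.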
