Abstract

Clustering by fast search and find of Density Peaks (referred to as DPC) was introduced by Alex Rodríguez and Alessandro Laio. The DPC algorithm is based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points of higher density. The power of DPC was demonstrated on several test cases: it can intuitively find the number of clusters, detect and exclude outliers automatically, and recognize clusters regardless of their shape and of the dimensionality of the space containing them. However, DPC has some drawbacks that must be addressed before it can be widely applied. First, the local density ρi of point i depends on the cutoff distance dc and is computed in different ways depending on the size of the dataset, which can influence the clustering, especially for small real-world cases. Second, the strategy for assigning the remaining points, after the density peaks (that is, the cluster centers) have been found, can create a "Domino Effect": once one point is assigned erroneously, many more points may subsequently be mis-assigned. This is especially the case in real-world datasets, where several clusters of arbitrary shape may overlap one another. To overcome these deficiencies, a robust clustering algorithm is proposed in this paper. To find the density peaks, the algorithm computes the local density ρi of point i relative to its K-nearest neighbors, for datasets of any size and independently of the cutoff distance dc, and assigns the remaining points to the most probable clusters using two new point-assignment strategies. The first strategy assigns non-outliers by a breadth-first search of the K-nearest neighbors of each point, starting from the cluster centers. The second strategy assigns the outliers, and any points left unassigned by the first strategy, using the technique of fuzzy weighted K-nearest neighbors.
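The decision-graph quantities underlying this idea can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the exponential KNN density used below (ρ as the exponential of the negative mean squared distance to the K nearest neighbors) is one common KNN-based choice and is assumed here for illustration, together with the standard DPC definition of δ as the distance to the nearest point of higher density.

```python
import math

def knn_density_peaks(points, k):
    """Toy sketch of the DPC-style decision-graph quantities, with the
    local density rho estimated from each point's K nearest neighbors
    instead of a cutoff distance d_c.  The density formula here is an
    assumed illustrative choice, not necessarily the paper's:
    rho_i = exp(-(1/k) * sum of squared distances to the k nearest
    neighbors)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        rho.append(math.exp(-sum(d * d for d in nearest) / k))
    # delta_i: distance to the nearest point of strictly higher density;
    # a point with no denser neighbor (a density peak) gets the maximum
    # distance to any other point, so it stands out on the decision graph.
    delta = [0.0] * n
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta[i] = min(higher) if higher else max(dist[i])
    return rho, delta
```

Points with both large ρ and large δ are the candidate cluster centers; ranking by the product ρ·δ is a common way to pick them out.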
The proposed clustering algorithm is benchmarked on publicly available synthetic and real-world datasets commonly used for testing the performance of clustering algorithms. Its clustering results are compared not only with those of DPC but also with those of several well-known clustering algorithms, including Affinity Propagation (AP), Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and K-means. The benchmarks used are clustering accuracy (Acc), Adjusted Mutual Information (AMI) and Adjusted Rand Index (ARI). The experimental results demonstrate that the proposed algorithm can find cluster centers, recognize clusters regardless of their shape and of the dimension of the space in which they are embedded, is robust to outliers, and often outperforms DPC, AP, DBSCAN and K-means.
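Of the three benchmarks, the Adjusted Rand Index is straightforward to reproduce. The following is a minimal pure-Python sketch using the standard contingency-table formula for ARI (not code from the paper); it measures chance-corrected agreement between two labelings, with 1.0 meaning identical partitions up to a relabeling of clusters.

```python
from math import comb
from collections import Counter

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index via the contingency table.
    ARI = (Index - Expected) / (Max - Expected), where Index sums
    comb(n_ij, 2) over contingency cells, and Expected/Max are derived
    from the row and column marginals."""
    n = len(labels_true)
    contingency = Counter(zip(labels_true, labels_pred))
    index = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())   # rows
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())   # cols
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (index - expected) / (max_index - expected)
```

For example, the two labelings [0, 0, 0, 1, 1, 1] and [1, 1, 1, 0, 0, 0] score 1.0, since ARI is invariant under renaming the cluster labels.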
