An Adaptive Density-Sensitive Similarity Measure Based Spectral Clustering Algorithm and Its Parallelization

Gen Zhang, Changyun Li, Kun Gong, Lanjun Wan, Mansheng Xiao

doi:10.1109/access.2021.3111156

Abstract

The clustering quality of the spectral clustering algorithm depends on how the similarity between samples is calculated. Although using the Gaussian kernel function to calculate the similarity can yield good clustering results, it relies on the setting of the kernel parameter. Therefore, an adaptive density-sensitive similarity measure based spectral clustering (DSSC) algorithm is proposed to improve the clustering effect. First, the Euclidean distances between samples are calculated to obtain the nearest neighbors of each sample. Second, the standard deviation of the distances between each sample and its nearest neighbors is calculated as the density parameter. Third, the density-sensitive distances between each sample and its nearest neighbors are calculated. Finally, the similarities between each sample and its nearest neighbors are calculated to construct a similarity matrix. In addition, the proposed DSSC algorithm is parallelized on the Dask distributed parallel computing platform with CPU+GPU, which improves the computational efficiency of the DSSC algorithm by taking full advantage of the CPU and GPU resources. A series of experiments on several synthetic datasets and UCI datasets is conducted to verify the effectiveness of the proposed DSSC algorithm, and the results show that the DSSC algorithm not only achieves satisfactory clustering results but also performs large-scale clustering analysis more efficiently.
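The first two steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `knn_density_parameters`, the neighbor count `k`, and the use of the population standard deviation are all choices made here for illustration. The density-sensitive distance and the similarity formula (steps three and four) are not given in this summary, so the sketch stops at the density parameter.

```python
import math
import statistics


def knn_density_parameters(samples, k):
    """Return (neighbors, sigma) for a list of equal-length numeric tuples.

    neighbors[i] lists the indices of the k nearest neighbors of sample i;
    sigma[i] is the standard deviation of the distances from sample i to
    those neighbors. Hypothetical helper, written only for illustration.
    """
    n = len(samples)
    # Step 1: pairwise Euclidean distances, then the k nearest neighbors.
    dist = [[math.dist(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Step 2: density parameter of each sample. The population standard
    # deviation is an assumption; the abstract does not say which form is used.
    sigma = [
        statistics.pstdev([dist[i][j] for j in neighbors[i]])
        for i in range(n)
    ]
    return neighbors, sigma


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    nbrs, sig = knn_density_parameters(points, k=2)
    for i, (nb, s) in enumerate(zip(nbrs, sig)):
        print(i, nb, round(s, 4))
```

Because the density parameter is computed per sample from its own neighborhood, no global kernel parameter has to be tuned, which is the dependence the abstract sets out to remove.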

Highlights

The clustering algorithm [1] is one of the unsupervised learning algorithms commonly used for data mining, and its purpose is to assign samples of the same class to the same cluster as far as possible.
Yang et al. [15] proposed a spectral clustering algorithm based on density-sensitive similarity, which uses an adjustable line segment length measure to calculate the distances between samples and construct a similarity matrix, and builds a random matrix based on the Markov chain.
An adaptive density-sensitive similarity measure based spectral clustering algorithm is proposed, which better calculates the similarities between samples and their nearest neighbors and thereby improves the clustering effect to a certain extent.

Summary

INTRODUCTION

The clustering algorithm [1] is one of the unsupervised learning algorithms commonly used for data mining, and its purpose is to assign samples of the same class to the same cluster as far as possible. Zhang et al. [14] proposed a spectral clustering algorithm based on local density adaptive similarity, which adopts the common near neighbor measure to construct a similarity matrix. Yang et al. [15] proposed a spectral clustering algorithm based on density-sensitive similarity, which uses an adjustable line segment length measure to calculate the distances between samples and construct a similarity matrix, and builds a random matrix based on the Markov chain. Although the above research can effectively reduce the running time of the spectral clustering algorithm, how to fully utilize all available computing resources of a cluster to improve the efficiency of large-scale clustering analysis is still a challenge. An adaptive density-sensitive similarity measure based spectral clustering algorithm is proposed, which better calculates the similarities between samples and their nearest neighbors and thereby improves the clustering effect to a certain extent.

OVERVIEW OF NJW ALGORITHM
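
For orientation, the standard NJW (Ng, Jordan, Weiss) procedure can be sketched as below. This is a minimal NumPy illustration of the textbook algorithm, not code from the paper: the function name `njw_spectral_clustering`, the global kernel parameter `sigma`, and the small farthest-point-initialized k-means loop are assumptions made here to keep the example self-contained.

```python
import numpy as np


def njw_spectral_clustering(X, n_clusters, sigma=1.0, n_iter=50):
    """Cluster the rows of X with the NJW recipe and return integer labels."""
    X = np.asarray(X, dtype=float)
    # Gaussian-kernel affinity with a global kernel parameter; zero diagonal.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalized affinity L = D^{-1/2} A D^{-1/2}, D being the degree matrix.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so the last columns
    # are the eigenvectors of the n_clusters largest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, -n_clusters:]
    # Rescale every row of the embedding to unit length.
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    # Plain k-means on the rows, with a deterministic farthest-point
    # initialization (an illustrative choice, not part of NJW itself).
    centers = [Y[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = Y[labels == c].mean(axis=0)
    return labels


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    print(njw_spectral_clustering(data, n_clusters=2))
```

The single `sigma` shared by all pairs of samples is the kernel parameter whose setting the abstract identifies as the weakness of the Gaussian-kernel similarity.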

THE PROPOSED DSSC ALGORITHM

PARALLELIZATION OF THE DSSC ALGORITHM

ANALYSIS OF TIME COMPLEXITY

ANALYSIS OF COMPUTATIONAL EFFICIENCY

Worker Nodes

Findings

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2021
Citations: 2	License type: CC BY 4.0
