Abstract

Spectral clustering has become one of the main clustering methods and has a wide range of applications. The similarity measure is crucial to separating clusters correctly in spectral clustering. Many existing spectral clustering algorithms measure similarity using the undirected k-Nearest Neighbor (kNN) graph or a Gaussian kernel function, which cannot reveal the true clusters in data sets that are not well separated. In this paper, we propose a novel algorithm called Spectral Clustering based on Shared Nearest Neighbors (SC-SNN) to improve clustering quality on such data sets. Instead of using distance as the similarity measure, SC-SNN measures similarity by considering the closeness of shared nearest neighbors in the directed kNN graph, which captures the underlying similarity relationships between data points and is robust when clusters are not well separated. Moreover, SC-SNN has only one parameter, k, and is less sensitive to its value than spectral clustering algorithms based on the undirected kNN graph. SC-SNN is evaluated on both synthetic and real-world data sets. The experimental results demonstrate that SC-SNN not only achieves good performance but also outperforms traditional spectral clustering algorithms.
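To make the idea of a shared-nearest-neighbor similarity concrete, the following is a minimal sketch, not the SC-SNN definition itself: the abstract does not state the exact formula, so this assumes the simplest variant, in which the similarity of two points is the fraction of their k directed nearest neighbors that they have in common. The function names `knn_lists` and `snn_similarity` are hypothetical and chosen only for illustration.

```python
import math


def knn_lists(points, k):
    """Directed kNN graph: for each point, the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p (ties broken by index).
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def snn_similarity(points, k):
    """Assumed SNN similarity: |kNN(i) & kNN(j)| / k, giving a symmetric matrix in [0, 1]."""
    nbrs = knn_lists(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j])
            sim[i][j] = sim[j][i] = shared / k
    return sim


# Two small, clearly separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
similarity = snn_similarity(points, k=2)
```

In this toy example, points in the same group share a neighbor and receive a positive similarity, while points in different groups share none and receive zero. A matrix of this kind could then serve as the affinity input to a standard spectral clustering routine, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`; how SC-SNN itself weights the shared neighbors is left to the full paper.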
