Clustering analysis has been widely used in statistics, machine learning, pattern recognition, image processing, and so on. It is a great challenge for most existing clustering algorithms to discover clusters with arbitrary shapes. Clustering algorithms based on Minimum spanning tree (MST) are able to discover clusters with arbitrary shapes, but they are time consuming and susceptible to noise points. In this paper, we employ local density peaks (LDP) to represent the whole data set and define a shared neighbors-based distance between local density peaks to better measure the dissimilarity between objects on manifold data. On the basis of local density peaks and the new distance, we propose a novel MST-based clustering algorithm called LDP-MST. It first uses local density peaks to construct MST and then repeatedly cuts the longest edge until a given number of clusters are found. The experimental results on synthetic data sets and real data sets show that our algorithm is competent with state-of-the-art methods when discovering clusters with complex structures.
Read full abstract