Abstract

Clustering based on local density peaks and graph cut (LDP-SC) is a state-of-the-art unsupervised clustering algorithm that first divides the data set into multiple local trees and then aggregates these local trees to obtain the final clustering result. However, for complex data sets, a single local tree may contain data points from different classes. In this article, we use pairwise constraint information to resolve this issue and propose a semi-supervised local density peaks and graph cut based clustering algorithm (SLDPC). In particular, SLDPC introduces intra-cluster and inter-cluster conflict resolution steps to split the local trees that are inconsistent with the provided pairwise constraints. Theoretically, we show that both steps terminate after a finite number of operations and that the resulting local trees are consistent with the pairwise constraints. Subsequently, root node redirection and noise filtering steps are designed to prevent the local trees from becoming too fragmented. Finally, we use the E2CP algorithm to further refine the similarity matrix between local trees using the pairwise constraints, and apply spectral clustering to obtain the clustering result. Experiments on multiple widely used synthetic and real-world data sets show that SLDPC outperforms LDP-SC and several other prominent semi-supervised clustering algorithms in most cases.
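To make the notion of a local tree being inconsistent with the pairwise constraints concrete, the following minimal Python sketch checks a forest against must-link and cannot-link pairs. It is an illustration under stated assumptions, not the SLDPC conflict resolution procedure itself: the parent-array representation of the local trees, the function names `find_root` and `find_conflicts`, and the toy data are all hypothetical, and the sketch only detects conflicts without splitting any tree.

```python
def find_root(parent, i):
    """Follow parent pointers until reaching a root (a node that is its own parent)."""
    while parent[i] != i:
        i = parent[i]
    return i


def find_conflicts(parent, must_link, cannot_link):
    """Report constraint pairs that disagree with the current forest.

    parent      -- hypothetical encoding: parent[i] is the parent of point i,
                   and a root satisfies parent[i] == i
    must_link   -- pairs of points labeled as belonging to the same class
    cannot_link -- pairs of points labeled as belonging to different classes
    """
    roots = [find_root(parent, i) for i in range(len(parent))]
    # A cannot-link pair inside one tree means that tree mixes classes.
    same_tree_violations = [(a, b) for a, b in cannot_link if roots[a] == roots[b]]
    # A must-link pair spread over two trees is not a violation of the trees
    # themselves, but it is information a later merging stage could use.
    cross_tree_pairs = [(a, b) for a, b in must_link if roots[a] != roots[b]]
    return roots, same_tree_violations, cross_tree_pairs


# Toy forest (assumed data): points 0-3 form a tree rooted at 0,
# points 4-5 form a tree rooted at 4.
parent = [0, 0, 1, 1, 4, 4]
must_link = [(0, 1), (3, 4)]
cannot_link = [(2, 3), (0, 5)]

roots, same_tree_violations, cross_tree_pairs = find_conflicts(
    parent, must_link, cannot_link
)
print(roots)                 # root of each point
print(same_tree_violations)  # cannot-link pairs sharing a tree
print(cross_tree_pairs)      # must-link pairs in different trees
```

In this toy case, the cannot-link pair (2, 3) falls inside the tree rooted at 0, so that tree would be a candidate for splitting, while the must-link pair (3, 4) spans two trees.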
