Abstract

We address the high computational complexity of the traditional spectral clustering algorithm, which prevents it from meeting the requirements of current large-scale data clustering applications. In this article, we establish a semi-supervised large-scale data clustering model based on constrained optimal propagation. In this model, a micro similarity matrix is first constructed from prior dotted pair constraint information. The Gabow algorithm is then used to extract each strongly connected component from the connected graph that represents the micro similarity matrix. Next, a new constrained optimisation propagation algorithm is proposed for each strongly connected component to calculate the similarity of the whole dataset. Finally, we employ singular value decomposition and the accelerated k-means algorithm to obtain the clustering results for large-scale data. Experiments on multiple standard testing datasets show that, compared with previous research results in this field, the proposed clustering model achieves higher clustering accuracy and lower computational complexity, and is more suitable for large-scale data clustering applications.
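To make the component-extraction step concrete, the following is a minimal Python sketch of Gabow's path-based algorithm for strongly connected components. It is an illustration only, not the authors' implementation. The helper `graph_from_matrix`, the rule that every nonzero off-diagonal entry (i, j) of the matrix denotes a directed edge i -> j, and the toy 6 x 6 matrix are all assumptions introduced here; the abstract does not specify how the micro similarity matrix is mapped to its graph.

```python
def graph_from_matrix(matrix):
    """Hypothetical helper: read nonzero off-diagonal entries as edges i -> j.

    This edge rule is an assumption made for illustration only.
    """
    n = len(matrix)
    return {i: [j for j in range(n) if j != i and matrix[i][j] != 0]
            for i in range(n)}


def strongly_connected_components(graph):
    """Gabow's path-based algorithm on a dict {vertex: [successors]}."""
    preorder = {}        # vertex -> visit number
    component = {}       # vertex -> index of its finished component
    open_vertices = []   # visited vertices not yet assigned to a component
    boundaries = []      # candidate roots of components on the current path
    components = []

    def visit(v):
        preorder[v] = len(preorder)
        open_vertices.append(v)
        boundaries.append(v)
        for w in graph[v]:
            if w not in preorder:
                visit(w)
            elif w not in component:
                # w is still open, so v and w lie on a common cycle:
                # merge every boundary that was opened after w.
                while preorder[boundaries[-1]] > preorder[w]:
                    boundaries.pop()
        if boundaries[-1] == v:
            # v is the root of a component: pop its members.
            boundaries.pop()
            members = []
            while True:
                w = open_vertices.pop()
                component[w] = len(components)
                members.append(w)
                if w == v:
                    break
            components.append(sorted(members))

    for v in graph:
        if v not in preorder:
            visit(v)
    return components


# Toy matrix (invented for this example): 0 -> 1 -> 2 -> 0 forms a cycle,
# 3 <-> 4 forms a second one, 2 -> 3 links them one way, 5 is isolated.
toy_matrix = [
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
components = strongly_connected_components(graph_from_matrix(toy_matrix))
print(components)  # [[3, 4], [0, 1, 2], [5]]
```

The recursive form is kept for readability. On a genuinely large graph it would need an explicit stack to avoid Python's recursion limit. Each resulting component is the unit on which the propagation step described above would then operate.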
