Fast spectral clustering with self-adapted bipartite graph learning

Xiaojun Yang, Mingjun Zhu, Yongda Cai, Zheng Wang, Feiping Nie

doi:10.1016/j.ins.2023.03.035

Abstract

Spectral clustering (SC) is a widely used clustering algorithm in data mining, image processing, and related fields. It is a graph-based algorithm capable of handling arbitrarily distributed data. However, the pairwise distances between samples in a high-dimensional space tend to become nearly equal, so the similarity matrix used by SC may be unreliable for high-dimensional data. In addition, SC constructs the similarity matrix and computes the clustering result in two separate steps. To solve these problems, a novel joint clustering method, called fast spectral clustering with self-adapted bipartite graph learning (FSBGL), is proposed. It obtains low-dimensional representations from high-dimensional data, thus decreasing the complexity and increasing the efficiency of the algorithm. In contrast to traditional spectral clustering algorithms, which obtain clustering results in a two-step process, FSBGL obtains clustering results directly from the connected components of the optimized similarity matrix while learning the optimized bipartite graph. This eliminates the adverse effect of performing these two steps separately in SC. In effect, a more discriminative low-dimensional representation can be derived from the adaptively learned bipartite graph, and the improved low-dimensional representation can in turn be used to learn the structure of the graph. The learning of the bipartite graph alternates with the update of the low-dimensional representation, which allows the algorithm to obtain more accurate clustering results. Furthermore, because the low-dimensional representation is based on fast spectral embedding, the algorithm performs better on some large-scale datasets. Experimental results indicate that FSBGL outperforms the compared methods on various datasets.
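To make the idea of a bipartite graph and a fast spectral embedding more concrete, the following is a minimal sketch of a generic anchor-based embedding in Python with NumPy. It is not the FSBGL algorithm: it builds a fixed bipartite graph once and does not learn or update it. The function name `bipartite_spectral_embedding`, the random anchor selection, the k-nearest-anchor Gaussian weights, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def bipartite_spectral_embedding(X, n_anchors=20, k=5, n_components=3, seed=0):
    """Embed n samples using an n-by-m sample-to-anchor (bipartite) graph.

    Illustrative sketch only; not the authors' FSBGL procedure.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: anchors are a random subset of the samples.
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]
    # Squared Euclidean distance from every sample to every anchor, shape (n, m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Keep only the k nearest anchors of each sample.
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    knn_d2 = d2[rows, nearest]
    # Assumption: Gaussian weights with a per-sample bandwidth.
    sigma2 = knn_d2.mean(axis=1, keepdims=True) + 1e-12
    w = np.exp(-knn_d2 / sigma2)
    # Bipartite graph B: each row is normalized to sum to 1.
    B = np.zeros((n, n_anchors))
    B[rows, nearest] = w / w.sum(axis=1, keepdims=True)
    # Scale columns by the inverse square root of the anchor degrees.
    anchor_deg = np.maximum(B.sum(axis=0), 1e-12)
    Z = B / np.sqrt(anchor_deg)
    # The left singular vectors of the thin n-by-m matrix Z are the eigenvectors
    # of the n-by-n similarity Z @ Z.T, which is never formed explicitly.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return B, U[:, :n_components], s[:n_components]
```

Because the decomposition is applied to an n-by-m matrix with m much smaller than n, the cost grows roughly linearly in the number of samples, which is the usual motivation for anchor-based embeddings on large datasets. The rows of the returned embedding could then be passed to a clustering step such as k-means.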

Full Text