Bipartite spectral graph partitioning (BSGP) method as a co-clustering method, has been widely used in document clustering, which simultaneously clusters documents and words by making full use of the duality between documents and words. It consists of two steps: 1) graph construction and 2) singular value decomposition on the bipartite graph to compute a continuous cluster assignment matrix, followed by post-processing to get the discrete solution. However, the generated graph is unstructured and fixed. It heavily relies on the quality of the graph construction. Moreover, the two-stage process may deviate from directly solving the primal problem. In order to tackle these defects, a novel bipartite graph partitioning method is proposed to learn a bipartite graph with exactly c connected components (c is the number of clusters), which can obtain clustering results directly. Furthermore, it is experimentally and theoretically proved that the solution of the proposed model is the discrete solution of the primal BSGP problem for a special situation. Experimental results on synthetic and real-world datasets exhibit the superiority of the proposed method.
Read full abstract