Spectral Clustering is widely used as a preprocessing method across research communities because of its excellent performance, but its scalability remains a challenge. Many efforts have been made to address this problem, and several solutions have been proposed, including Nyström Approximation and Sparse Representation Approximation. However, according to our survey, there is still considerable room for improvement. This work thoroughly investigates the factors relevant to large-scale Spectral Clustering and proposes a general framework, Robust and Efficient Spectral k-Means (RESKM), to accelerate Spectral Clustering. The contributions of RESKM are threefold: (1) a unified framework is proposed for large-scale Spectral Clustering; (2) the framework consists of four phases, each of which is theoretically analyzed and paired with a suggested acceleration; (3) the majority of existing large-scale Spectral Clustering methods can be integrated into RESKM and thereby accelerated. Experiments on datasets of different scales demonstrate the robustness and efficiency of RESKM.