Unsupervised clustering is a central problem in data mining and pattern recognition. Building on deep learning paradigms, deep clustering algorithms have been studied extensively and have achieved superior performance in various applications. However, most previous methods do not exploit the information carried by neighborhood relations to form a group-separated space, and the feature embedding is usually distorted during training. To tackle the former limitation, we develop a graph-convolution-based unsupervised learning algorithm named Stacked Graph Autoencoder (SGAE). Specifically, SGAE uses the message passing mechanism to aggregate information from neighbors and obtain a meaningful, well-separated latent representation. Since no adjacency matrix is available in clustering tasks, a graph construction approach with two pruning strategies is introduced to generate a transition matrix. To reduce the distortion caused by training a multi-layered network, we further propose a topological structure preservation mechanism, which uses the constructed adjacency graph as supervisory information to maintain the relationships between nodes in the original space. Experiments on several popular benchmark datasets show that SGAE achieves significant improvements over unsupervised and semi-supervised deep clustering methods.
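The two graph-related ideas in the abstract, building a transition matrix when no adjacency matrix is given and aggregating neighbor information by message passing, can be illustrated with a minimal sketch. The abstract does not specify SGAE's two pruning strategies, so the k-nearest-neighbor pruning below is only an assumed stand-in, and the function names `build_transition_matrix` and `propagate`, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np

def build_transition_matrix(X, k=3):
    """Row-stochastic transition matrix from a k-nearest-neighbor graph.

    Assumption: kNN pruning stands in for the paper's (unspecified) strategies.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(D, np.inf)  # exclude each node from its own neighbor search
    # Keep only the k nearest neighbors of every node.
    idx = np.argsort(D, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)  # symmetrize the graph
    A += np.eye(n)          # add self-loops
    return A / A.sum(axis=1, keepdims=True)  # normalize rows to sum to 1

def propagate(P, H, W):
    """One message-passing layer: ReLU(P @ H @ W)."""
    return np.maximum(P @ H @ W, 0.0)

# Toy data: two well-separated groups of five 2-D points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(5, 0.1, (5, 2))])
P = build_transition_matrix(X, k=3)
Z = propagate(P, X, rng.normal(size=(2, 4)))  # random weights, untrained
```

On this toy input the constructed graph has no edges between the two groups, so each node aggregates features only from its own group. A real model would stack several such layers and learn `W` rather than draw it at random.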