Image clustering is a complex procedure whose outcome depends heavily on the choice of image representation. Image representations are generally produced either by handcrafted features or by trained neural networks. On high-dimensional data, both approaches run into problems: i) the representational power of manually designed features is limited; ii) non-representative or meaningless features from a trained deep network may hurt clustering performance. To overcome these problems, we propose a new clustering method that efficiently builds an image representation and precisely discovers the cluster assignments. Our main tools are an unsupervised representation learning method based on a Deep Mutual Information Maximization (DMIM) system and a clustering method based on a self-training algorithm. Specifically, to extract an informative representation of the image data, we derive the maximum mutual information theory and propose a system that learns to maximize the mutual information between the input images and their latent representations. To discover the clusters and assign each image a cluster label, a self-training mechanism is applied to the learned representations. The validity and superiority of our algorithm are verified in a series of real-world image clustering experiments.
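For reference, the quantity being maximized can be written with the standard definition of mutual information. The symbols below are assumptions introduced only for illustration: $X$ denotes an input image, $E_\theta$ a hypothetical encoder with parameters $\theta$, and $Z = E_\theta(X)$ the latent representation. The abstract does not state which estimator or bound DMIM uses, so this is a sketch of the general objective rather than the exact formulation.

```latex
% Standard definition of mutual information between input X and latent Z
I(X; Z) = \mathbb{E}_{p(x, z)}\left[ \log \frac{p(x, z)}{p(x)\, p(z)} \right]

% Generic representation-learning objective (assumed form):
% choose encoder parameters that maximize the shared information
\theta^{\ast} = \arg\max_{\theta} \; I\bigl(X;\, E_\theta(X)\bigr)
```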