Abstract

Deep clustering refers to joint representation learning and clustering with deep neural networks. Existing methods fall mainly into two categories: discriminative and generative. The former learns representations for clustering directly through discriminative mechanisms, while the latter estimates the latent distribution of each cluster to generate data points and then infers cluster assignments. Although generative methods have the advantage of estimating the latent distributions of clusters, their performance still falls significantly behind that of discriminative methods. In this work, we argue that this performance gap might be partly due to the overlap between the data distributions of different clusters; indeed, generative methods offer little guarantee that the distributions of different clusters are separated in the data space. To tackle this problem, we theoretically prove that mutual information maximization promotes the separation of different clusters in the data space, which provides a theoretical justification for deep generative clustering with mutual information maximization. Our theoretical analysis directly leads to a model that integrates a hierarchical generative adversarial network with mutual information maximization. Moreover, we propose three techniques to stabilize and enhance the model and empirically demonstrate their effectiveness. The proposed approach notably outperforms other generative models for deep clustering on public benchmarks.
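To illustrate why mutual information maximization could encourage such separation (a brief sketch in our own notation, not reproduced from the paper): writing $X$ for a data point and $C$ for its cluster assignment, a standard identity expresses the mutual information as

$$I(X; C) = \sum_{c} p(c)\, \mathrm{KL}\!\left( p(x \mid c) \,\middle\|\, p(x) \right),$$

so maximizing $I(X; C)$ drives each cluster-conditional distribution $p(x \mid c)$ away from the overall mixture $p(x)$, which reduces the overlap between clusters in the data space.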
