Abstract

As an important branch of unsupervised learning, clustering plays a central role in data mining. It is well known that capturing the group-discriminative properties of each sample is crucial for clustering. Among clustering methods, deep clustering delivers promising results owing to the strong representational power of neural networks. However, most deep clustering methods adopt sample-level learning strategies: a standalone data point barely captures the context of its whole cluster and may therefore receive a sub-optimal cluster assignment. To tackle this issue, we propose a Structure-driven Representation Learning (SRL) method that introduces latent structure information into the representation learning process at both the local and global levels. Specifically, a local-structure-driven sample representation strategy is proposed to approximate the estimation of the data distribution; it models the neighborhood distribution of samples with latent structure information and exploits the statistical dependencies between them to improve cluster consistency. A global-structure-driven cluster representation strategy is also designed, in which the context of each cluster is encoded from both its samples (exemplar theory) and its corresponding prototype (prototype theory). In this way, each cluster is related only to its most similar samples, and different clusters are separated as much as possible. The two strategies are seamlessly combined into a joint optimization problem that can be solved efficiently. Experiments on six widely used datasets demonstrate the superiority of SRL over state-of-the-art clustering methods.
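The local/global idea described above can be illustrated with a minimal NumPy sketch. Note that the specific loss forms below (a k-nearest-neighbor consistency term and a cosine-based prototype separation term) are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def knn_indices(z, k):
    """Indices of the k nearest neighbors of each row of z (self excluded)."""
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                          # exclude self-matches
    return np.argsort(d2, axis=1)[:, :k]

def srl_losses(z, p, k=3, eps=1e-9):
    """Hypothetical sketch of the two SRL terms.

    z : (n, d) sample embeddings.
    p : (n, K) soft cluster assignments (rows sum to 1).

    Local term: cross-entropy between each sample's assignment and the mean
    assignment of its k nearest neighbors (neighborhood cluster consistency).
    Global term: mean pairwise cosine similarity between cluster prototypes,
    to be minimized so that different clusters are pushed apart.
    """
    nbrs = knn_indices(z, k)
    q = p[nbrs].mean(axis=1)                      # neighborhood distribution
    local = -(q * np.log(p + eps)).sum(1).mean()  # cross-entropy H(q, p)

    # Prototypes: assignment-weighted mean embedding per cluster, L2-normalized.
    protos = (p.T @ z) / (p.sum(0)[:, None] + eps)
    protos /= np.linalg.norm(protos, axis=1, keepdims=True) + eps
    sim = protos @ protos.T
    K = p.shape[1]
    global_sep = sim[~np.eye(K, dtype=bool)].mean()  # off-diagonal similarity
    return local, global_sep
```

A joint objective would then weight and sum the two terms; with confident, neighborhood-consistent assignments the local term approaches the assignment entropy, and with well-separated clusters the prototype similarity approaches -1.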
