Abstract

Multi-manifold clustering is among the most fundamental tasks in signal processing and machine learning. Although existing multi-manifold clustering methods are powerful, learning the number of clusters automatically from data remains a challenge. In this paper, we propose a novel unsupervised generative clustering approach within the Bayesian nonparametric framework. Specifically, our manifold method selects the cluster number automatically by means of a Dirichlet Process (DP) prior. A DP-based mixture model with a constrained Mixture of Gaussians (MoG) is then constructed to handle the manifold data. Finally, we integrate the model with the k-nearest neighbor graph to capture the geometric information of the manifolds. We also derive an efficient optimization algorithm for model inference. Experimental results on synthetic datasets and real-world benchmark datasets demonstrate the effectiveness of this new DP-based manifold method.

Highlights

  • Over the past decades, clustering has been one of the most fundamental tasks in many computer vision and data mining applications [1,2], e.g., image/motion segmentation [3,4], community detection [5], feature selection [6] and biological/network information analysis [7,8]

  • Clustering accuracy in our experiment was measured through Normalized Mutual Information (NMI) [32]

  • Suppose U = {U_1, U_2, U_3, ..., U_|U|} denotes the real cluster labels obtained from the ground truth and V = {V_1, V_2, V_3, ..., V_|V|} denotes the labels obtained from a clustering algorithm. |U| and |V| denote the respective numbers of clusters
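
The NMI score described above can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, and the normalization by the geometric mean of the two entropies is an assumption, since the excerpt does not state which NMI variant [32] defines.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information between two label assignments of the same points."""
    n = len(u)
    count_u = Counter(u)
    count_v = Counter(v)
    joint = Counter(zip(u, v))
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) )
        mi += (c / n) * math.log(c * n / (count_u[a] * count_v[b]))
    return mi


def nmi(u, v):
    """NMI with geometric-mean normalization (an assumed variant)."""
    if len(u) != len(v):
        raise ValueError("label lists must have the same length")
    h_u, h_v = entropy(u), entropy(v)
    if h_u == 0.0 or h_v == 0.0:
        # Degenerate case: at least one assignment has a single cluster.
        return 1.0 if h_u == h_v else 0.0
    return mutual_information(u, v) / math.sqrt(h_u * h_v)


ground_truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
print(nmi(ground_truth, same_up_to_renaming))
print(nmi(ground_truth, unrelated))
```

The score depends only on how the points are grouped, not on the label names, so a clustering that matches the ground truth up to renaming scores the maximum.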

Summary

Introduction

Over the past decades, clustering has been one of the most fundamental tasks in many computer vision and data mining applications [1,2], e.g., image/motion segmentation [3,4], community detection [5], feature selection [6] and biological/network information analysis [7,8]. Most conventional clustering methods assume that data samples are scattered in the feature space, which ignores the intrinsic underlying structure that many real datasets have [3,9]. To overcome this problem, various manifold-based clustering (multi-manifold clustering) methods have been proposed and developed. Sparse Subspace Clustering (SSC)-based [16], Low-Rank Representation (LRR)-based [17] and Least Squares Regression (LSR)-based [18] methods approach the linear manifold clustering problem by representing each point in terms of the other data points, under a sparsity, low-rank or least-squares regularizer, respectively.
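
To make the self-representation idea concrete, the following is a minimal sketch of the least-squares variant, assuming the ridge-regularized objective min_Z ||X - XZ||_F^2 + lam * ||Z||_F^2, whose closed-form solution is Z = (X^T X + lam * I)^{-1} X^T X. The function name `lsr_affinity`, the value of `lam`, the toy data, and the omission of a zero-diagonal constraint are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def lsr_affinity(X, lam=0.1):
    """Affinity matrix from a ridge-regularized self-representation.

    X has shape (d, n), one data point per column. Each point is expressed
    as a linear combination of all points; large coefficients suggest that
    two points lie on the same linear manifold.
    """
    n = X.shape[1]
    gram = X.T @ X
    # Closed form of min_Z ||X - XZ||_F^2 + lam * ||Z||_F^2
    Z = np.linalg.solve(gram + lam * np.eye(n), gram)
    # Symmetrize so the result can be used as a graph affinity.
    return np.abs(Z) + np.abs(Z.T)


rng = np.random.default_rng(0)
# Toy data: four points on each of two orthogonal lines in R^3.
line_a = np.outer([1.0, 0.0, 0.0], rng.uniform(1, 2, size=4))
line_b = np.outer([0.0, 1.0, 0.0], rng.uniform(1, 2, size=4))
X = np.hstack([line_a, line_b])

W = lsr_affinity(X)
print(np.round(W, 3))
```

On this toy input the affinity is block diagonal, so a standard spectral clustering step on `W` would separate the two lines. With manifolds that are not orthogonal, or with noise, the cross-block entries are generally nonzero.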
