Abstract

Subspace clustering is a celebrated problem that arises in a variety of applications, such as motion segmentation and face clustering. The goal is to recover clusters of data points lying in different subspaces from similarity measurements across the data points. While the algorithmic aspect of this problem has been extensively studied in the literature, the information-theoretic limit on the number of similarities required for reliable clustering has been unknown. In this paper, we translate the problem into an instance of community recovery in hypergraphs, and characterize the sharp threshold on the number of similarities required for exact subspace clustering. Moreover, we present a computationally efficient algorithm that achieves this fundamental limit.

