The geometry of kernelized spectral clustering

Geoffrey Schiebinger, Bin Yu, Martin J Wainwright

doi:10.1214/14-aos1283

Abstract

Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the top eigenvectors of the normalized Laplacian to recover the connected components. We study the performance of spectral clustering in recovering the latent labels of i.i.d. samples from a finite mixture of nonparametric distributions. The difficulty of this label recovery problem depends on the overlap between mixture components and how easily a mixture component is divided into two nonoverlapping components. When the overlap is small compared to the indivisibility of the mixture components, the principal eigenspace of the population-level normalized Laplacian operator is approximately spanned by the square-root kernelized component densities. In the finite sample setting, and under the same assumption, embedded samples from different components are approximately orthogonal with high probability when the sample size is large. As a corollary we control the fraction of samples mislabeled by spectral clustering under finite mixtures with nonparametric components.
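The near-orthogonality of embedded samples described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes a Gaussian kernel, the convention L = D^{-1/2} K D^{-1/2} for the normalized Laplacian matrix, and row-normalized embeddings. The function name `spectral_embedding`, the `bandwidth` parameter, and the synthetic two-component data are all hypothetical choices made for illustration.

```python
import numpy as np


def spectral_embedding(X, k, bandwidth=1.0):
    """Embed rows of X using the top-k eigenvectors of D^{-1/2} K D^{-1/2}.

    Assumptions (not taken from the paper): Gaussian kernel with the given
    bandwidth, and each embedded point rescaled to unit length.
    """
    # Pairwise squared Euclidean distances between samples.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian kernel matrix.
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    # Degrees and symmetric normalization: L_ij = K_ij / sqrt(d_i * d_j).
    d = K.sum(axis=1)
    L = K / np.sqrt(np.outer(d, d))
    # eigh returns eigenvalues in ascending order, so the last k columns
    # span the principal eigenspace.
    _, eigvecs = np.linalg.eigh(L)
    V = eigvecs[:, -k:]
    # Rescale every embedded sample to unit norm.
    return V / np.linalg.norm(V, axis=1, keepdims=True)


# Synthetic mixture of two well-separated components (illustrative data only).
rng = np.random.default_rng(0)
n = 50
X = np.vstack([
    rng.normal(0.0, 0.5, size=(n, 2)),
    rng.normal(6.0, 0.5, size=(n, 2)),
])
labels = np.repeat([0, 1], n)

V = spectral_embedding(X, k=2)

# Absolute cosine between embedded samples, averaged within and across components.
cos = np.abs(V @ V.T)
same = labels[:, None] == labels[None, :]
within = cos[same].mean()
across = cos[~same].mean()
print(f"mean |cos| within components: {within:.3f}")
print(f"mean |cos| across components: {across:.3f}")
```

With overlap this small, embedded samples from the same component should be nearly parallel and samples from different components nearly orthogonal, so a simple clustering step (for example k-means on the rows of `V`) would be expected to recover the labels. As the components move closer together, the across-component cosine would be expected to grow.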

Journal: The Annals of Statistics
Publication Date: Apr 1, 2015
Citations: 73
License type: unspecified-oa
