Abstract

Sampling is the key aspect for Nyström extension based spectral clustering. Traditional sampling schemes select the set of landmark points on a whole and focus on how to lower the matrix approximation error. However, the matrix approximation error does not have direct impact on the clustering performance. In this article, we propose a sampling framework from an incremental perspective, i.e., the landmark points are selected one by one, and each next point to be sampled is determined by previously selected landmark points. Incremental sampling builds explicit relationships among landmark points; thus, they work together well and provide a theoretical guarantee on the clustering performance. We provide two novel analysis methods and propose two schemes for selecting-the-next-one of the framework. The first scheme is based on clusterability analysis, which provides a better guarantee on clustering performance than schemes based on matrix approximation error analysis. The second scheme is based on loss analysis, which provides maximized predictive ability of the landmark points on the (implicit) labels of the unsampled points. Experimental results on a wide range of benchmark datasets demonstrate the superiorities of our proposed incremental sampling schemes over existing sampling schemes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call