Abstract

Spectral clustering methods partition a dataset into clusters by mapping the input data points into the space spanned by the eigenvectors of the Laplacian matrix. In this article, we make use of the incomplete Cholesky decomposition (ICD) to construct an approximation of the graph Laplacian and reduce the size of the related eigenvalue problem from N to m, with m ≪ N. In particular, we introduce a new stopping criterion based on the normalized mutual information between consecutive partitions, which terminates the ICD when the change in the cluster assignments falls below a given threshold. Compared with existing ICD-based spectral clustering approaches, the proposed method reduces the number m of selected pivots (i.e., it yields a sparser model) while maintaining high clustering quality. The method scales linearly with the number of input data points N and has low memory requirements, because only matrices of size N × m and m × m are computed (in contrast to standard spectral clustering, which requires constructing the full N × N similarity matrix). Furthermore, we show that the number of clusters can be reliably selected with the gap heuristic computed on a small matrix R of size m × m instead of the entire graph Laplacian. The effectiveness of the proposed algorithm is tested on several datasets.
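The stopping criterion compares consecutive partitions via normalized mutual information (NMI). As a minimal sketch of that quantity, the function below computes NMI between two labelings from their contingency table, using geometric-mean normalization; this is one common convention and may differ in detail from the definition used in the paper.

```python
import numpy as np

def nmi(a, b):
    """Normalized mutual information between two cluster labelings.

    Uses the geometric-mean normalization NMI = I(a; b) / sqrt(H(a) H(b)).
    """
    a = np.asarray(a)
    b = np.asarray(b)
    n = len(a)
    _, ia = np.unique(a, return_inverse=True)
    _, ib = np.unique(b, return_inverse=True)
    # contingency table of joint label counts
    C = np.zeros((ia.max() + 1, ib.max() + 1))
    np.add.at(C, (ia, ib), 1)
    P = C / n                      # joint distribution
    pa = P.sum(axis=1)             # marginal of labeling a
    pb = P.sum(axis=0)             # marginal of labeling b
    nz = P > 0
    mi = (P[nz] * np.log(P[nz] / (pa[:, None] * pb[None, :])[nz])).sum()
    ha = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()
    hb = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()
    denom = np.sqrt(ha * hb)
    return mi / denom if denom > 0 else 1.0
```

In a stopping loop, one would recluster after each new pivot and stop once `nmi(prev_labels, curr_labels)` exceeds `1 - threshold`, i.e., once the partition has effectively stabilized.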

Highlights

  • In this paper, we deal with the data clustering problem

  • We introduce a spectral clustering algorithm that exploits the incomplete Cholesky decomposition to reduce the size of the eigenvalue problem

  • The incomplete Cholesky decomposition (ICD) [21] reduces the computational time required by the full Cholesky decomposition by computing a low-rank approximation of the matrix A with accuracy τ in O(m²N) operations
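The pivoted ICD can be sketched as follows: greedily select the pivot with the largest residual diagonal entry, append the corresponding column to a factor G with A ≈ GGᵀ, and stop when the trace of the residual drops below τ. Only N × m entries of A are ever formed, giving the O(m²N) cost. This is a generic sketch of the classical algorithm, not the paper's exact implementation; the interface (`kernel_column`, `max_rank`) is our own choice.

```python
import numpy as np

def incomplete_cholesky(kernel_column, diag, tol, max_rank):
    """Pivoted incomplete Cholesky: A ≈ G @ G.T with trace residual < tol.

    kernel_column(i) returns column i of the (implicit) N x N matrix A;
    diag is its diagonal. Only N x m entries of A are evaluated.
    """
    n = diag.shape[0]
    d = diag.astype(float).copy()          # residual diagonal
    G = np.zeros((n, max_rank))
    pivots = []
    for k in range(max_rank):
        if d.sum() < tol:                  # trace of residual below tau
            break
        i = int(np.argmax(d))              # greedy pivot selection
        pivots.append(i)
        col = kernel_column(i)
        g = (col - G[:, :k] @ G[i, :k]) / np.sqrt(d[i])
        G[:, k] = g
        d -= g ** 2
        d[i] = 0.0                         # guard against round-off
    return G[:, :len(pivots)], pivots
```

Each iteration costs O(kN) for the back-substitution, so m steps cost O(m²N) in total.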


Summary

Introduction

Clustering refers to techniques for partitioning unlabeled data into natural groups, where data points that are related to each other are grouped together and dissimilar points are assigned to different groups [1]. In this context, spectral clustering [2,3,4,5] has been shown to be among the most successful methods in many application domains, mainly due to its ability to discover nonlinear cluster boundaries. The algorithm is based on computing the eigendecomposition of a matrix derived from the data, called the Laplacian. We introduce a spectral clustering algorithm that exploits the incomplete Cholesky decomposition to reduce the size of the eigenvalue problem.
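For reference, standard (unapproximated) spectral clustering follows the pattern the paper's method accelerates: build a similarity matrix, form the normalized Laplacian, take its k smallest eigenvectors, and run k-means on the row-normalized embedding. The sketch below follows the common Ng–Jordan–Weiss variant with an RBF similarity; the bandwidth `sigma` and the small deterministic k-means are our own illustrative choices, not the paper's.

```python
import numpy as np

def _kmeans(U, k, iters=50):
    # Deterministic farthest-point initialization, then Lloyd iterations.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min(((U[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(U[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    """Plain spectral clustering on the symmetric normalized Laplacian."""
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))            # RBF similarity, N x N
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]                               # k smallest eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # row-normalize
    return _kmeans(U, k)
```

The O(N³) eigendecomposition of the N × N Laplacian is exactly the step that the ICD-based method replaces with an m × m problem.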

Spectral Clustering
Incomplete Cholesky Decomposition
A Reduced Eigenvalue Problem
Proposed Algorithm
New Stopping Criterion
Choosing the Number of Clusters
Experimental Results
Computational Complexity and Memory Requirements
Conclusions
SBO 100031 Belgian Federal Science Policy Office

