Tuning-free sparse clustering via alternating hard-thresholding

Wei Dong, Chen Xu, Jinhan Xie, Niansheng Tang

doi:10.1016/j.jmva.2024.105330

Abstract

Model-based clustering is a commonly used technique for partitioning heterogeneous data into homogeneous groups. When the analysis involves a large number of features, analysts face simultaneous challenges in model interpretability, clustering accuracy, and computational efficiency. Several Bayesian and penalization methods have been proposed to select important features for model-based clustering. However, the performance of those methods relies on careful algorithmic tuning, which can be time-consuming in high-dimensional cases. In this paper, we propose a new sparse clustering method based on alternating hard-thresholding. The new method is conceptually simple and tuning-free. Given a user-specified sparsity level, it efficiently detects a set of key features by eliminating a large number of features that are less useful for clustering. Based on the selected key features, one can readily obtain an effective clustering of the original high-dimensional data under a general sparse covariance structure. Under mild conditions, we show that the new method leads to clusters whose misclassification rate is consistent with the optimal rate, as if the underlying true model were used. The promising performance of the new method is supported by both simulated and real data examples.
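The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of alternating between cluster assignment and hard-thresholding of features. It is a k-means-style stand-in, not the authors' model-based procedure, and it ignores the sparse covariance structure mentioned above. The function names (`sparse_cluster`, `feature_scores`, `toy_data`), the between-cluster sum-of-squares score, and the farthest-point initialization are all assumptions made for illustration.

```python
def sq_dist(a, b, feats):
    """Squared Euclidean distance restricted to the feature indices in feats."""
    return sum((a[j] - b[j]) ** 2 for j in feats)


def assign(X, centers, feats):
    """Assign each point to its nearest center, using only the kept features."""
    return [min(range(len(centers)), key=lambda c: sq_dist(x, centers[c], feats))
            for x in X]


def cluster_means(X, labels, k, old_centers):
    """Per-cluster feature means; an empty cluster keeps its previous center."""
    p = len(X[0])
    centers = []
    for c in range(k):
        members = [x for x, lab in zip(X, labels) if lab == c]
        if members:
            centers.append([sum(x[j] for x in members) / len(members)
                            for j in range(p)])
        else:
            centers.append(old_centers[c])
    return centers


def feature_scores(X, labels, centers):
    """Assumed score: between-cluster sum of squares of each feature."""
    n, p = len(X), len(X[0])
    overall = [sum(x[j] for x in X) / n for j in range(p)]
    sizes = [labels.count(c) for c in range(len(centers))]
    return [sum(sizes[c] * (centers[c][j] - overall[j]) ** 2
                for c in range(len(centers)))
            for j in range(p)]


def sparse_cluster(X, k, s, max_iter=50):
    """Alternate between hard-thresholding features and reassigning points."""
    p = len(X[0])
    all_feats = list(range(p))
    # Farthest-point initialization (an assumption, chosen to be deterministic).
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda x: min(sq_dist(x, c, all_feats)
                                                for c in centers)))
    labels = assign(X, centers, all_feats)
    feats = all_feats
    for _ in range(max_iter):
        centers = cluster_means(X, labels, k, centers)
        scores = feature_scores(X, labels, centers)
        # Hard-thresholding step: keep only the s highest-scoring features.
        feats = sorted(sorted(range(p), key=lambda j: -scores[j])[:s])
        new_labels = assign(X, centers, feats)
        if new_labels == labels:  # stop once the assignment is stable
            break
        labels = new_labels
    return labels, feats


def toy_data():
    """20 points, 10 features; only features 0 and 1 separate the two groups."""
    def noise(i, j):
        return ((i * 7 + j * 3) % 5 - 2) * 0.5  # deterministic values in [-1, 1]
    return [[noise(i, j) + (10.0 if j < 2 and i >= 10 else 0.0)
             for j in range(10)]
            for i in range(20)]


labels, feats = sparse_cluster(toy_data(), k=2, s=2)
print(feats, labels)
```

In this toy setting the sparsity level `s` plays the role of the user-specified sparsity level from the abstract: the sketch needs no penalty parameter, only the number of features to keep. The paper's actual method and its theoretical guarantees concern a model-based formulation that this sketch does not reproduce.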

More From: Journal of Multivariate Analysis

Similar Papers

Iterative Reclassification in Agglomerative Clustering
Nicholas A Heard
Journal of Computational and Graphical Statistics | VOL. 20
01 Jan 2010

Mixtures of factor analyzers with covariates for modeling multiply censored dependent variables
Wan-Lun Wang ... Tsung-I Lin
Statistical Papers | VOL. 62
30 May 2020

Flexible clustering via extended mixtures of common t-factor analyzers
Wan-Lun Wang ... Tsung-I Lin
AStA Advances in Statistical Analysis | VOL. 101
02 Nov 2016

Model-based clustering of censored data via mixtures of factor analyzers
Wan-Lun Wang ... Tsung-I Lin
Computational Statistics & Data Analysis | VOL. 140
19 Jun 2019
