Sparse subspace clustering (SSC) is widely used in machine learning and pattern recognition, but it scales poorly to large datasets. Stochastic SSC (SSSC) has recently emerged as an effective remedy by leveraging the dropout technique. However, SSSC is not robust to noise, especially non-Gaussian noise, which leads to unsatisfactory clustering performance. To address these issues, we propose a novel robust and stochastic method called stochastic sparse subspace clustering with the Huber function (S3CH). The key idea is to use the Huber surrogate to measure the loss in the stochastic self-expression framework, so that S3CH inherits the scalability of the stochastic framework on large-scale problems while mitigating sensitivity to non-Gaussian noise. On the algorithmic side, we develop an efficient optimization scheme based on proximal alternating minimization (PAM). On the theoretical side, we rigorously prove the convergence of the generated sequence. Extensive numerical experiments on synthetic data and six real datasets validate the advantages of the proposed method in clustering accuracy, noise robustness, parameter sensitivity, post-hoc analysis, and model stability.
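
To illustrate why a Huber surrogate reduces sensitivity to large, non-Gaussian residuals, the following minimal sketch compares the standard textbook Huber function with the squared loss on a few scalar residuals. The threshold name `delta`, its default value, and the sample residuals are assumptions made for illustration only; the abstract does not specify how S3CH parameterizes the Huber function, and this is not the S3CH objective itself.

```python
def huber(r, delta=1.0):
    """Standard Huber function: quadratic for |r| <= delta, linear beyond.

    `delta` is an illustrative threshold, not a value taken from the paper.
    """
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def squared(r):
    """Squared loss, for comparison (same 1/2 scaling as the Huber function)."""
    return 0.5 * r * r


# Hypothetical residuals: three small ones and one large outlier.
residuals = [0.1, -0.5, 1.0, 10.0]
huber_losses = [huber(r) for r in residuals]
squared_losses = [squared(r) for r in residuals]

for r, h, s in zip(residuals, huber_losses, squared_losses):
    print(f"residual={r:6.2f}  huber={h:8.3f}  squared={s:8.3f}")
```

For small residuals the two losses coincide, while for the outlier the Huber function grows only linearly, so a single corrupted entry contributes far less to the total loss than it would under the squared loss.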