A clustering method for single-cell RNA-seq data based on automatic weighting penalty and low-rank representation.

Juan Wang, Zhenchang Wang, Chunhou Zheng, Shasha Yuan, Junliang Shang, Jinxing Liu

doi:10.1109/tcbb.2024.3362472

Abstract

Advances in high-throughput single-cell RNA sequencing (scRNA-seq) technology have provided more comprehensive biological information on cell expression. Clustering analysis is a critical step in scRNA-seq research and provides clear knowledge of cell identity. Unfortunately, the characteristics of scRNA-seq data and the limitations of existing technologies make clustering considerably challenging. Moreover, some existing methods treat all features equally and ignore differences in their contributions, which leads to a loss of information. To overcome these limitations, we introduce a weighted distance constraint into the construction of the similarity graph and combine it with a similarity constraint. We propose the Joint Automatic Weighting Similarity Graph and Low-rank Representation (JAGLRR) clustering method. Evaluating the contribution of each feature and assigning different weight values increases the significance of valuable features while decreasing the interference of redundant features. The similarity constraint allows the model to generate a more symmetric affinity matrix. Benefiting from that affinity matrix, JAGLRR recovers the original linear relationship of the data more accurately and obtains more discriminative information. The results on simulated datasets and 8 real datasets show that JAGLRR outperforms 11 existing comparison methods in clustering experiments, with higher clustering accuracy and stability.
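The idea of a feature-weighted similarity graph can be illustrated with a minimal sketch. This is not the JAGLRR optimization itself: the abstract does not give the objective function, so the weights here are simply supplied by the caller rather than learned automatically, and the Gaussian kernel, the bandwidth `sigma`, and the function name `weighted_affinity` are all assumptions made for illustration. The sketch only shows how per-feature weights change pairwise distances and how an affinity matrix can be made symmetric.

```python
import numpy as np


def weighted_affinity(X, w, sigma=1.0):
    """Build a symmetric affinity matrix from feature-weighted distances.

    X     : (n_cells, n_features) expression matrix.
    w     : (n_features,) non-negative feature weights (assumed given here;
            JAGLRR learns its weights, which this sketch does not do).
    sigma : Gaussian kernel bandwidth (a hypothetical parameter).
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)

    # Pairwise differences between cells, shape (n_cells, n_cells, n_features).
    diff = X[:, None, :] - X[None, :, :]

    # Weighted squared Euclidean distance: sum_k w_k * (x_ik - x_jk)^2.
    d2 = np.einsum("ijk,k->ij", diff ** 2, w)

    # Gaussian kernel turns distances into similarities in (0, 1].
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)  # no self-loops in the graph

    # Average with the transpose so the affinity matrix is symmetric.
    return 0.5 * (A + A.T)


# Toy data: the second feature is treated as redundant and given weight 0.
X = np.array([[0.0, 0.0],
              [0.0, 10.0],
              [1.0, 0.0]])
A = weighted_affinity(X, w=np.array([1.0, 0.0]))
```

With the second feature down-weighted to zero, cells 0 and 1 differ only in the ignored feature and therefore receive the maximum similarity, while cells 0 and 2 differ in the retained feature and receive a lower one.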

Full Text