Abstract
The growing maturity of single-cell RNA-sequencing (scRNA-seq) technology allows us to explore the heterogeneity of tissues, organisms, and complex diseases at the cellular level. Clustering is a key step in single-cell data analysis. However, the high dimensionality of scRNA-seq data, the ever-increasing number of cells, and unavoidable technical noise make clustering challenging. Motivated by the good performance of contrastive learning in multiple domains, we propose ScCCL, a novel self-supervised contrastive learning method for clustering scRNA-seq data. ScCCL first randomly masks the gene expression of each cell twice and adds a small amount of Gaussian noise, and then uses a momentum encoder structure to extract features from the augmented data. Contrastive learning is then applied in two modules: an instance-level contrastive learning module and a cluster-level contrastive learning module. After training, we obtain a representation model that can efficiently extract high-order embeddings of single cells. We evaluated ScCCL on multiple public datasets using two metrics, the adjusted Rand index (ARI) and normalized mutual information (NMI). The results show that ScCCL improves clustering performance over the benchmark algorithms. Notably, since ScCCL does not depend on a specific type of data, it may also be helpful for clustering analysis of single-cell multi-omics data.
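The augmentation and momentum-encoder steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`augment`, `two_views`, `momentum_update`), the masking probability, the noise standard deviation, and the momentum coefficient are all hypothetical choices made for illustration, and the sketch assumes that masking means setting an expression value to zero and that noise is added to every gene after masking. The momentum update shown is the common exponential-moving-average rule used by momentum encoders in general; the abstract does not specify the exact form ScCCL uses.

```python
import random


def augment(expression, mask_prob=0.2, noise_std=0.01, rng=None):
    """Return one augmented view of a cell's expression vector.

    Assumptions (not from the abstract): masking sets a value to zero,
    and Gaussian noise is added to every entry after masking.
    """
    rng = rng or random.Random()
    view = []
    for value in expression:
        # Randomly mask this gene with probability mask_prob.
        masked = 0.0 if rng.random() < mask_prob else value
        # Add a small amount of zero-mean Gaussian noise.
        view.append(masked + rng.gauss(0.0, noise_std))
    return view


def two_views(expression, seed=0):
    """Augment the same cell twice to form a pair for contrastive learning."""
    rng = random.Random(seed)
    return augment(expression, rng=rng), augment(expression, rng=rng)


def momentum_update(key_params, query_params, m=0.99):
    """Exponential-moving-average update of the momentum (key) encoder.

    Each key parameter moves slowly toward the matching query parameter:
    key <- m * key + (1 - m) * query. The value m=0.99 is an assumption.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


if __name__ == "__main__":
    cell = [0.0, 1.5, 3.2, 0.7, 2.1]  # toy expression values, made up
    view_a, view_b = two_views(cell)
    print(view_a)
    print(view_b)
    print(momentum_update([1.0, 2.0], [0.0, 0.0], m=0.5))
```

In a full pipeline the two views would be passed through the query encoder and the momentum encoder respectively, and the resulting embeddings would feed the instance-level and cluster-level contrastive losses; those parts are omitted here because the abstract does not describe them in enough detail.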
More From: IEEE/ACM Transactions on Computational Biology and Bioinformatics