Dimension Reduction and Clustering Models for Single-Cell RNA Sequencing Data: A Comparative Study.

Chao Feng, Dan Li, Hao Zhang, Shufen Liu, Yanchun Liang, Renchu Guan, Fengfeng Zhou, Xiaoyue Feng

doi:10.3390/ijms21062181

Abstract

With recent advances in single-cell RNA sequencing, enormous transcriptome datasets have been generated. These datasets have furthered our understanding of cellular heterogeneity and its underlying mechanisms in homogeneous populations. Single-cell RNA sequencing (scRNA-seq) data clustering can group cells belonging to the same cell type based on patterns embedded in gene expression. However, scRNA-seq data are high-dimensional, noisy, and sparse, owing to the limitations of existing scRNA-seq technologies. Traditional clustering methods are neither effective nor efficient for high-dimensional, sparse matrix computations. Therefore, several dimension reduction methods have been introduced. To validate a reliable and standard research routine, we conducted a comprehensive review and evaluation of four classical dimension reduction methods and five clustering models. Four experiments were progressively performed on two large scRNA-seq datasets using 20 models. Results showed that the feature selection method contributed positively to high-dimensional and sparse scRNA-seq data. Moreover, feature-extraction methods were able to promote clustering performance, although not in every setting. Independent component analysis (ICA) performed well in small compressed feature spaces, whereas principal component analysis was steadier than all the other feature-extraction methods. In addition, ICA was not ideal for fuzzy C-means clustering in scRNA-seq data analysis. K-means clustering combined with feature-extraction methods achieved good results.

Highlights

Owing to the development of microfluidics, large numbers of cells can be isolated [1]
Compared with the high fluctuation of the independent component analysis (ICA)-based combinations, we discovered that the other eight combinations of hierarchical + principal component analysis (PCA), K-means + PCA, density-based spatial clustering of applications with noise (DBSCAN) + PCA, Louvain + PCA, hierarchical + non-negative matrix factorization (NMF), K-means + NMF, fuzzy C-means + NMF, and Louvain + NMF all achieved red promotion (Figure 6)
With PCA feature extraction, these three types of cells could be categorized into four groups, as shown in the red oval in Figure 8b.
Figure 7. Effectiveness of gene selection on mouse visual cortex data

Summary

Introduction

Owing to the development of microfluidics, large numbers of cells can be isolated [1]. Advances in RNA isolation and amplification have enabled the application of RNA-sequencing (RNA-seq) technology to analyze the transcriptomes of single cells [2,3,4]. Large-scale single-cell data provide new ways to address biological problems; however, they also pose specific analytical and technical challenges, such as high dimensionality, sparse matrix computation, and rare cell type detection [6,7]. The computational analysis of scRNA-seq data involves several steps, including quality control, mapping, quantification, dimensionality reduction, clustering, finding trajectories, and identifying differentially expressed genes [4]. Among these steps, dimensionality reduction and clustering are two of the most important, with substantial effects on downstream analysis.
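To make the dimensionality reduction and clustering steps concrete, the following is a minimal sketch of one of the combinations named in the study (feature selection, then PCA, then K-means). It is not the authors' implementation. It assumes NumPy only, runs on simulated counts rather than real scRNA-seq data, and all function names (`simulate_counts`, `select_variable_genes`, `pca`, `kmeans`, `adjusted_rand_index`) are hypothetical. The adjusted Rand index is used here purely as an illustrative agreement score; this excerpt does not state which metric the paper used.

```python
import numpy as np


def simulate_counts(n_per_type=100, n_genes=500, n_types=3,
                    n_markers=30, dropout=0.3, seed=0):
    """Simulated (not real) sparse counts: each cell type over-expresses its own marker genes."""
    rng = np.random.default_rng(seed)
    rates = np.ones((n_types, n_genes))
    for t in range(n_types):
        rates[t, t * n_markers:(t + 1) * n_markers] = 10.0
    labels = np.repeat(np.arange(n_types), n_per_type)
    counts = rng.poisson(rates[labels])
    counts[rng.random(counts.shape) < dropout] = 0  # mimic dropout-induced sparsity
    return counts, labels


def select_variable_genes(X, n_genes):
    """Feature selection: keep the n_genes columns with the highest variance."""
    order = np.argsort(X.var(axis=0))[::-1]
    return X[:, order[:n_genes]]


def pca(X, n_components):
    """Feature extraction: project centered data onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def adjusted_rand_index(a, b):
    """Agreement between two labelings, corrected for chance (1.0 means identical partitions)."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=np.int64)
    np.add.at(table, (ai, bi), 1)  # contingency table of label co-occurrences

    def comb2(n):
        return n * (n - 1) / 2.0

    sum_ij = comb2(table).sum()
    sum_a = comb2(table.sum(axis=1)).sum()
    sum_b = comb2(table.sum(axis=0)).sum()
    expected = sum_a * sum_b / comb2(len(ai))
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_ij - expected) / (max_index - expected)


counts, truth = simulate_counts()
logged = np.log1p(counts)                             # variance-stabilizing transform
selected = select_variable_genes(logged, n_genes=100) # feature selection
embedded = pca(selected, n_components=10)             # feature extraction
pred = kmeans(embedded, k=3)                          # clustering
print("ARI:", round(adjusted_rand_index(truth, pred), 3))
```

Swapping `pca` or `kmeans` for another extraction or clustering routine while keeping the rest of the pipeline fixed is one way to compare combinations in the spirit of the study; the gene count, number of components, and `k` above are arbitrary choices for the simulated data.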

Methods

Results

Conclusion

Journal: International Journal of Molecular Sciences
Publication Date: Mar 22, 2020
Citations: 36
License type: CC BY 4.0

Similar Papers

SCMcluster: a high-precision cell clustering algorithm integrating marker gene set with single-cell RNA sequencing data.
Hao Wu ... Meili Wang
Briefings in functional genomics | VOL. 22
25 Feb 2023

LRT: Integrative analysis of scRNA-seq and scTCR-seq data to investigate clonal differentiation heterogeneity.
Juan Xie ... Mengjie Chen
PLOS Computational Biology | VOL. 19
10 Jul 2023

Identifying Genetic Signatures from Single-Cell RNA Sequencing Data by Matrix Imputation and Reduced Set Gene Clustering
Soumita Seth ... Tapas Bhadra
Mathematics | VOL. 11
17 Oct 2023

A Regularized Multi-Task Learning Approach for Cell Type Detection in Single-Cell RNA Sequencing Data.
Piu Upadhyay ... Sumanta Ray
Frontiers in genetics | VOL. 13
13 Apr 2022
