Abstract

Background: Dimensionality reduction is an indispensable analytic component for many areas of single-cell RNA sequencing (scRNA-seq) data analysis. Proper dimensionality reduction can allow for effective noise removal and facilitate many downstream analyses that include cell clustering and lineage reconstruction. Unfortunately, despite the critical importance of dimensionality reduction in scRNA-seq analysis and the vast number of dimensionality reduction methods developed for scRNA-seq studies, few comprehensive comparison studies have been performed to evaluate the effectiveness of different dimensionality reduction methods in scRNA-seq.

Results: We aim to fill this critical knowledge gap by providing a comparative evaluation of a variety of commonly used dimensionality reduction methods for scRNA-seq studies. Specifically, we compare 18 different dimensionality reduction methods on 30 publicly available scRNA-seq datasets that cover a range of sequencing techniques and sample sizes. We evaluate the performance of different dimensionality reduction methods for neighborhood preserving in terms of their ability to recover features of the original expression matrix, and for cell clustering and lineage reconstruction in terms of their accuracy and robustness. We also evaluate the computational scalability of different dimensionality reduction methods by recording their computational cost.

Conclusions: Based on the comprehensive evaluation results, we provide important guidelines for choosing dimensionality reduction methods for scRNA-seq data analysis. We also provide all analysis scripts used in the present study at www.xzlab.org/reproduce.html.
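As a rough illustration of what a "neighborhood preserving" evaluation can look like, the sketch below compares each cell's k nearest neighbors before and after dimensionality reduction and averages the Jaccard overlap of the two neighbor sets. The abstract does not state which metric the study uses, so the Jaccard overlap, the function names, and the toy data here are all assumptions made for illustration, not the authors' implementation.

```python
from math import dist


def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by Euclidean distance; break ties by index for determinism.
        others.sort(key=lambda j: (dist(p, points[j]), j))
        neighbors.append(set(others[:k]))
    return neighbors


def neighborhood_preservation(original, reduced, k):
    """Mean Jaccard overlap between k-NN sets in the original and reduced spaces.

    Hypothetical helper: the choice of Jaccard overlap is an assumption.
    """
    a = knn_sets(original, k)
    b = knn_sets(reduced, k)
    scores = [len(x & y) / len(x | y) for x, y in zip(a, b)]
    return sum(scores) / len(scores)


# Toy data (made up): 6 "cells" x 3 "genes"; the third gene is small noise.
original = [
    (0.0, 0.0, 0.1),
    (0.1, 0.0, 0.0),
    (0.0, 0.1, 0.1),
    (5.0, 5.0, 0.0),
    (5.1, 5.0, 0.1),
    (5.0, 5.1, 0.0),
]
good = [(x, y) for x, y, _ in original]  # keeps the two informative axes
bad = [(z,) for _, _, z in original]     # keeps only the noise axis

good_score = neighborhood_preservation(original, good, k=2)
bad_score = neighborhood_preservation(original, bad, k=2)
print(good_score, bad_score)
```

A score of 1 means every cell keeps exactly the same neighbors after reduction; an embedding that discards the informative axes scores lower. In practice the reduced coordinates would come from a method such as PCA or UMAP rather than from dropping columns by hand.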

Highlights

  • Single-cell RNA sequencing is a rapidly growing and widely applied technology [1,2,3]

  • We evaluated the ability of different dimensionality reduction methods to preserve the original features of the expression matrix and, more importantly, their effectiveness for two important single-cell analytic tasks: cell clustering and lineage inference

  • The results show that six dimensionality reduction methods, principal component analysis (PCA), independent components analysis (ICA), factor analysis (FA), ZINB-WaVE, multidimensional scaling (MDS), and uniform manifold approximation and projection (UMAP), often achieve both accurate clustering performance and highly stable and consistent results across the subsets
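To make "accurate clustering performance" concrete, the sketch below scores a predicted clustering against reference cell-type labels with the adjusted Rand index (ARI). The text above does not name the accuracy measure, so using ARI is an assumption, and the function and the labels are hypothetical illustrations.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    n = len(labels_true)
    # Contingency counts: how many cells share each (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every cell in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Made-up labels for four cells.
truth = [0, 0, 1, 1]
same_partition = [1, 1, 0, 0]   # same grouping, cluster names swapped
mixed_partition = [0, 1, 0, 1]  # splits every true cluster

ari_same = adjusted_rand_index(truth, same_partition)
ari_mixed = adjusted_rand_index(truth, mixed_partition)
print(ari_same, ari_mixed)
```

ARI ignores how clusters are named, so a relabeled but otherwise identical partition still scores 1. Repeating the same calculation on random subsets of cells is one simple way to probe how stable a method's clustering is.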


Summary

Introduction

Single-cell RNA sequencing (scRNA-seq) is a rapidly growing and widely applied technology [1,2,3]. Because of the importance of dimensionality reduction in scRNA-seq analysis, many dimensionality reduction methods have been developed and are routinely used in scRNA-seq software tools, including but not limited to cell clustering tools [12, 13] and lineage reconstruction tools [14]. Most commonly used scRNA-seq clustering methods rely on dimensionality reduction as the first analytic step [15]. Dimensionality reduction is an indispensable analytic component for many areas of scRNA-seq data analysis. Proper dimensionality reduction can allow for effective noise removal and facilitate many downstream analyses, including cell clustering and lineage reconstruction. Despite the critical importance of dimensionality reduction in scRNA-seq analysis and the vast number of dimensionality reduction methods developed for scRNA-seq studies, few comprehensive comparison studies have been performed to evaluate the effectiveness of different dimensionality reduction methods in scRNA-seq.
