Abstract

Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method in analyzing a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods, including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM.

Highlights

  • Technological advances have enabled researchers to isolate individual cells from bulk samples and sequence their transcriptomes at the single-cell level, an approach known as single-cell RNA-Sequencing (scRNA-Seq)

  • scRNA-Seq has been used to detect heterogeneity within cell populations, and it has greatly enhanced our understanding of the regulatory programs involved in systems such as glioblastoma (Patel et al, 2014), neuronal cells (Usoskin et al, 2014), and pluripotent stem cells (PSCs) (Kumar et al, 2014)

  • We demonstrate the capabilities of non-negative matrix factorization (NMF) in scRNA-Seq data analysis in the following aspects: (1) accurate clustering of single cells from different conditions in an unsupervised manner; (2) detection of important genes associated with differences among subclasses
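
To make the clustering idea in the highlights concrete, here is a minimal sketch of NMF-based unsupervised clustering on a toy expression matrix. It is not the NMFEM package's implementation: it assumes a generic NMF fitted with Lee-Seung multiplicative updates, and the helper names (`nmf`, `assign_clusters`), the synthetic count data, and all parameter values are hypothetical choices made for illustration. The D-score gene ranking is not reproduced, since its definition is not given in this section.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x cells) as W @ H.

    Uses Lee-Seung multiplicative updates for the squared-error objective.
    W is genes x k (gene loadings), H is k x cells (cell loadings).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_cells)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_clusters(W, H):
    """Assign each cell to the factor with the largest rescaled loading.

    Rescaling H by the column sums of W removes the arbitrary scale
    shared between W and H before taking the argmax.
    """
    scale = W.sum(axis=0)
    return np.argmax(H * scale[:, None], axis=0)


# Synthetic toy data (an assumption, not real scRNA-Seq counts):
# 50 genes x 40 cells, two groups of 20 cells with distinct marker genes.
rng = np.random.default_rng(42)
n_genes, n_per_group = 50, 20
V = rng.poisson(1.0, size=(n_genes, 2 * n_per_group)).astype(float)
V[:10, :n_per_group] += rng.poisson(20.0, size=(10, n_per_group))
V[10:20, n_per_group:] += rng.poisson(20.0, size=(10, n_per_group))

W, H = nmf(V, k=2)
labels = assign_clusters(W, H)
print(labels)
```

In this sketch the number of factors `k` is set to the known number of groups; with real data it would have to be chosen, for example by comparing the stability of the clustering across several values of `k`.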

Summary

Introduction

Technological advances have enabled researchers to isolate individual cells from bulk samples and sequence their transcriptomes at the single-cell level, an approach known as single-cell RNA-Sequencing (scRNA-Seq). This technology reveals the gene expression programs within cells at an unprecedentedly fine resolution (Kumar et al, 2014). It has been used to detect heterogeneity within cell populations, and it has greatly enhanced our understanding of the regulatory programs involved in systems such as glioblastoma (Patel et al, 2014), neuronal cells (Usoskin et al, 2014), and pluripotent stem cells (PSCs) (Kumar et al, 2014). It is expected to continue providing transformative insights in the near future (Pan, 2014; Poirion et al, 2016).

