Resolving single-cell heterogeneity from hundreds of thousands of cells through sequential hybrid clustering and NMF.

Meenakshi Venkatasubramanian, Nathan Salomonis, Kashish Chetal, Daniel J Schnell, Gowtham Atluri

doi:10.1093/bioinformatics/btaa201

Open Access

https://doi.org/10.1093/bioinformatics/btaa201

Abstract

Motivation: The rapid proliferation of single-cell RNA-sequencing (scRNA-Seq) technologies has spurred the development of diverse computational approaches to detect transcriptionally coherent populations. While the complexity of the algorithms for detecting heterogeneity has increased, most require significant user-tuning, are heavily reliant on dimension reduction techniques and are not scalable to ultra-large datasets. We previously described a multi-step algorithm, Iterative Clustering and Guide-gene Selection (ICGS), which applies intra-gene correlation and hybrid clustering to uniquely resolve novel transcriptionally coherent cell populations from an intuitive graphical user interface.

Results: We describe a new iteration of ICGS that outperforms state-of-the-art scRNA-Seq detection workflows when applied to well-established benchmarks. This approach combines multiple complementary subtype detection methods (HOPACH, sparse non-negative matrix factorization, cluster ‘fitness’, support vector machine) to resolve rare and common cell-states, while minimizing differences due to donor or batch effects. Using data from multiple cell atlases, we show that the PageRank algorithm effectively downsamples ultra-large scRNA-Seq datasets, without losing extremely rare or transcriptionally similar yet distinct cell types and while recovering novel transcriptionally distinct cell populations. We believe this new approach holds tremendous promise in reproducibly resolving hidden cell populations in complex datasets.

Availability and implementation: ICGS2 is implemented in Python. The source code and documentation are available at http://altanalyze.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
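The PageRank-based downsampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the ICGS2 implementation: the k-nearest-neighbour graph, the Euclidean distance, the damping factor, the rule of keeping the highest-ranked cells, and all function names (`knn_graph`, `pagerank`, `downsample`) are assumptions made for illustration only, and the toy "cells" are random 2-D points rather than expression profiles.

```python
import math
import random


def knn_graph(points, k):
    """Build a directed k-nearest-neighbour graph as {node: [neighbour, ...]}."""
    graph = {}
    for i, p in enumerate(points):
        # Distance from cell i to every other cell, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; every node is assumed to have outgoing edges."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        # Teleportation term, shared equally by all nodes.
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            # Each node passes its damped rank equally to its neighbours.
            share = damping * rank[node] / len(neighbours)
            for nb in neighbours:
                new_rank[nb] += share
        rank = new_rank
    return rank


def downsample(points, k=5, keep=20):
    """Return indices of the `keep` cells with the highest PageRank score."""
    rank = pagerank(knn_graph(points, k))
    return sorted(rank, key=rank.get, reverse=True)[:keep]


# Toy data: 60 hypothetical cells in a 2-D embedding (assumed, not real data).
rng = random.Random(0)
cells = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
kept = downsample(cells, k=5, keep=20)
print(len(kept), "of", len(cells), "cells retained")
```

Ranking cells by their centrality in a neighbour graph is only one way such scores could be used to pick a representative subset. Whether a simple top-ranked selection like this preserves rare populations depends on how the graph is built, which this sketch does not address.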

Highlights

Recent advances in single-cell RNA sequencing provide exciting new opportunities to understand cellular and molecular diversity in healthy tissues and disease
Using data from the Human Cell Atlas, we show that the PageRank algorithm effectively downsamples ultra-large scRNA-Seq datasets, without losing extremely rare or transcriptionally similar yet distinct cell types and while recovering novel transcriptionally unique cell populations
While the specific algorithms and options used vary significantly among applications, most approaches rely heavily on dimensionality reduction techniques, such as PCA, t-SNE and UMAP, to select features and define cell populations

Summary

Introduction

Recent advances in single-cell RNA sequencing (scRNA-Seq) provide exciting new opportunities to understand cellular and molecular diversity in healthy tissues and disease. While the specific algorithms and options used vary significantly among applications, most approaches rely heavily on dimensionality reduction techniques, such as PCA, t-SNE and UMAP, to select features and define cell populations. A number of methods exist to identify clusters from large lower-dimensional projections, including DBSCAN, K-means, affinity propagation, Louvain clustering and spectral clustering, but these, like other approaches, require proper hyperparameter tuning. Identifying these parameters is non-intuitive and often requires multiple rounds of analysis. The increasing production of atlas-sized datasets highlights the need for highly scalable and automated computational approaches that can rapidly identify common and extremely rare populations with minimal user parameter tweaking.
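The sensitivity to hyperparameters described above can be seen in a small sketch. The code below is a toy, pure-Python DBSCAN written for illustration (it is not taken from the paper or from any library), applied to nine hypothetical 1-D values. The data, the `eps` values and the helper name `count_clusters` are all assumptions.

```python
def dbscan(points, eps, min_samples):
    """Toy DBSCAN on 1-D data; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if abs(points[i] - q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})


# Three tight groups of hypothetical values; only eps changes between runs.
points = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 5.0, 5.1, 5.2]
for eps in (0.05, 0.15, 1.0, 5.0):
    print(eps, count_clusters(dbscan(points, eps, min_samples=2)))
```

On this toy data the same points are reported as all noise, three clusters, two clusters or one cluster depending only on `eps`, which is the kind of repeated tuning the paragraph refers to.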

Journal: Bioinformatics
Publication Date: Mar 24, 2020
Citations: 46
License type: CC BY 4.0
