Linear-time cluster ensembles of large-scale single-cell RNA-seq and multimodal data.

Van Hoan Do, Stefan Canzar, Francisca Rojas Ringeling

doi:10.1101/gr.267906.120

Abstract

A fundamental task in single-cell RNA-seq (scRNA-seq) analysis is the identification of transcriptionally distinct groups of cells. Numerous methods have been proposed for this problem, with a recent focus on methods for the cluster analysis of ultralarge scRNA-seq data sets produced by droplet-based sequencing technologies. Most existing methods rely on a sampling step to bridge the gap between algorithm scalability and the volume of the data. Ignoring large parts of the data, however, often yields inaccurate groupings of cells and risks overlooking rare cell types. We propose Specter, a method that adopts and extends recent algorithmic advances in (fast) spectral clustering. In contrast to methods that cluster a (random) subsample of the data, we adopt the idea of landmarks that are used to create a sparse representation of the full data from which a spectral embedding can then be computed in linear time. We exploit Specter's speed in a cluster ensemble scheme that achieves a substantial improvement in accuracy over existing methods and identifies rare cell types with high sensitivity. Its linear-time complexity allows Specter to scale to millions of cells and leads to fast computation times in practice. Furthermore, on CITE-seq data that simultaneously measure gene and protein marker expression, we show that Specter is able to use multimodal omics measurements to resolve subtle transcriptomic differences between subpopulations of cells.
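The landmark idea in the abstract can be illustrated with a minimal sketch. This is not the Specter implementation; it is a generic landmark-based spectral embedding under several assumptions that the abstract does not specify: landmarks are chosen uniformly at random, affinities use a Gaussian kernel over each cell's nearest landmarks, the cell-by-landmark matrix is stored densely for brevity (a real implementation would keep it sparse), and a plain k-means groups the embedded cells. The function names `landmark_spectral_embedding` and `kmeans` and all parameter names are hypothetical. The cluster ensemble step is not shown.

```python
import numpy as np


def landmark_spectral_embedding(X, n_landmarks=30, n_neighbors=5, n_components=2, seed=0):
    """Embed rows of X through a cell-by-landmark affinity matrix (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: landmarks are sampled uniformly at random from the cells.
    landmarks = X[rng.choice(n, size=n_landmarks, replace=False)]
    # Squared distances from every cell to every landmark: O(n * p) for p landmarks.
    d2 = (X ** 2).sum(1)[:, None] - 2.0 * X @ landmarks.T + (landmarks ** 2).sum(1)[None, :]
    d2 = np.maximum(d2, 0.0)
    # Each cell keeps only its n_neighbors nearest landmarks.
    nearest = np.argpartition(d2, n_neighbors - 1, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    d2_near = d2[rows, nearest]
    # Assumption: Gaussian kernel with bandwidth set from the mean retained distance.
    sigma2 = d2_near.mean() + 1e-12
    w = np.exp(-d2_near / (2.0 * sigma2))
    w /= w.sum(axis=1, keepdims=True)
    Z = np.zeros((n, n_landmarks))  # dense here for brevity; conceptually sparse
    Z[rows, nearest] = w
    # Scale columns by the inverse square root of the landmark degrees.
    Z_hat = Z / np.sqrt(Z.sum(axis=0) + 1e-12)
    # Eigen-decompose the small p x p Gram matrix instead of an n x n affinity matrix.
    evals, evecs = np.linalg.eigh(Z_hat.T @ Z_hat)
    top = np.argsort(evals)[::-1][:n_components]
    # Map back to cells: these are left singular vectors of Z_hat.
    return Z_hat @ evecs[:, top] / np.sqrt(np.maximum(evals[top], 1e-12))


def kmeans(U, k, n_iter=50):
    """Lloyd's k-means with farthest-first initialisation (illustrative helper)."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(2)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):  # leave a center unchanged if its cluster is empty
                centers[j] = U[labels == j].mean(0)
    return labels


# Toy data standing in for an expression matrix: two well-separated groups of "cells".
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (300, 10)), rng.normal(10.0, 1.0, (300, 10))])
U = landmark_spectral_embedding(X, n_landmarks=30, n_neighbors=5, n_components=2)
labels = kmeans(U, k=2)
```

For a fixed number of landmarks p and retained neighbors, every step touches each cell a constant number of times, and the only eigendecomposition is on a p x p matrix, which is why the cost grows linearly with the number of cells.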

Journal: Genome Research
Publication Date: Feb 24, 2021
Citations: 14
License type: CC BY-NC

Similar Papers

Fast Constrained Spectral Clustering and Cluster Ensemble with Random Projection.
Wenfen Liu ... Mao Ye
Computational intelligence and neuroscience | VOL. 2017
01 Jan 2017

Learning discriminative and structural samples for rare cell types with deep generative model.
Haiyue Wang ... Xiaoke Ma
Briefings in Bioinformatics | VOL. 23
01 Aug 2022

Author response: Proximity labeling of protein complexes and cell-type-specific organellar proteomes in Arabidopsis enabled by TurboID
Andrea Mair ... Tess C Branon
06 Sep 2019

Anchor-based fast spectral ensemble clustering
Runxin Zhang ... Xuelong Li
Information Fusion | VOL. 113
18 Jul 2024
