Regulatory network-based imputation of dropouts in single-cell RNA sequencing data.

Ana Carolina Leote, Andreas Beyer, Xiaohui Wu

doi:10.1371/journal.pcbi.1009849

Abstract

Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values ('dropout imputation'). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Further, it is unknown if all genes equally benefit from imputation or which imputation method works best for a given gene. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Further, the cell-to-cell variation of 11.3% to 48.8% of the genes could not be adequately imputed by any of the methods that we tested. In those cases gene expression levels were best predicted by the mean expression across all cells, i.e. assuming no measurable expression variation between cells. These findings suggest that different imputation methods are optimal for different genes. We thus implemented an R-package called ADImpute (available via Bioconductor https://bioconductor.org/packages/release/bioc/html/ADImpute.html) that automatically determines the best imputation method for each gene in a dataset. Our work represents a paradigm shift by demonstrating that there is no single best imputation method. Instead, we propose that imputation should maximally exploit external information and be adapted to gene-specific features, such as expression level and expression variation across cells.
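The idea of choosing an imputation method separately for each gene can be illustrated with a minimal Python sketch. This is not ADImpute's implementation (ADImpute is an R package); the function names, the masking fraction, and the `weights` matrix that stands in for a regulatory network learned from external data are all assumptions made for illustration. The sketch hides a random subset of observed values, lets each candidate method predict them, and keeps, per gene, the method with the lowest mean squared error on the hidden entries. The gene-wise mean across all cells serves as the baseline mentioned in the abstract.

```python
import numpy as np


def gene_mean_imputer(masked):
    """Baseline: predict every entry of a gene by its mean across all cells."""
    return np.repeat(masked.mean(axis=1, keepdims=True), masked.shape[1], axis=1)


def make_network_imputer(weights):
    """Hypothetical network-based imputer.

    `weights` is a genes x genes coefficient matrix, assumed to have been
    learned from external expression data. Each gene's deviation from its
    mean is predicted as a weighted sum of the deviations of other genes.
    """
    def impute(masked):
        center = masked.mean(axis=1, keepdims=True)
        return center + weights @ (masked - center)
    return impute


def select_method_per_gene(expr, imputers, mask_fraction=0.3, seed=0):
    """Pick, for each gene, the imputer with the lowest error on hidden values.

    expr     : genes x cells matrix; zeros are treated as unobserved.
    imputers : dict mapping a method name to a function that takes the
               masked matrix and returns a full prediction matrix.
    Returns (list of best method names per gene, methods x genes MSE array).
    """
    rng = np.random.default_rng(seed)
    observed = expr > 0
    # Hide a random subset of the observed entries to create a test set.
    hide = observed & (rng.random(expr.shape) < mask_fraction)
    masked = np.where(hide, 0.0, expr)
    n_hidden = hide.sum(axis=1)

    names = list(imputers)
    n_genes = expr.shape[0]
    mse = np.full((len(names), n_genes), np.inf)
    for i, name in enumerate(names):
        pred = imputers[name](masked)
        sq_err = np.where(hide, (pred - expr) ** 2, 0.0).sum(axis=1)
        # Genes without hidden entries keep an infinite error for every
        # method, so the first method in `imputers` is chosen for them.
        mse[i] = np.divide(sq_err, n_hidden,
                           out=np.full(n_genes, np.inf), where=n_hidden > 0)

    best = [names[j] for j in mse.argmin(axis=0)]
    return best, mse


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    regulator = rng.uniform(1.0, 5.0, size=500)
    target = 2.0 * regulator           # toy gene fully driven by the regulator
    flat = np.full(500, 3.0)           # toy gene without cell-to-cell variation
    expr = np.vstack([regulator, target, flat])

    weights = np.zeros((3, 3))
    weights[1, 0] = 2.0                # assumed edge: regulator -> target

    methods = {"mean": gene_mean_imputer,
               "network": make_network_imputer(weights)}
    best, mse = select_method_per_gene(expr, methods, mask_fraction=0.1)
    print(best)
    print(mse.round(3))
```

In this toy setup the baseline is listed first, so it is kept whenever another method is merely tied with it; a gene is assigned to the network-based method only when that method predicts the hidden values strictly better.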

Highlights

Single-cell RNA sequencing has become a routine method, revolutionizing our understanding of biological processes as diverse as tumor evolution, embryonic development, and ageing
Network-based dropout imputation carcinoma data are available in the Single Cell Portal
Human embryonic kidney (HEK) cell data are available in ArrayExpress

Summary

Introduction

Single-cell RNA sequencing (scRNA-seq) has become a routine method, revolutionizing our understanding of biological processes as diverse as tumor evolution, embryonic development, and ageing. Current technologies still suffer from the problem that large numbers of genes remain undetected in single cells even though they are expressed (dropout events). The dropout rate depends on the sampling depth, i.e. the number of reads or transcript molecules (determined with unique molecular identifiers, UMIs) quantified in a given cell. Genes with regulatory functions, e.g. transcription factors, kinases, and regulatory non-coding RNAs (ncRNAs), are typically lowly expressed and prone to be missed in scRNA-seq experiments. This poses problems for the interpretation of the experiments if one aims at understanding the regulatory processes responsible for the transcriptional makeup of the given cell.

Journal: PLOS Computational Biology
Publication Date: Feb 17, 2022
Citations: 12
License type: CC BY 4.0
