Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data

Shuonan Chen, Jessica C Mar

doi:10.1186/s12859-018-2217-z

Open Access

https://doi.org/10.1186/s12859-018-2217-z

Abstract

Background: It is a fundamental fact of biology that genes do not operate in isolation, yet methods that infer regulatory networks from single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods specific to single cell data are now emerging, it is unknown whether they perform better than general methods. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data.

Results: Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when applied to either experimental or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other.

Conclusions: This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and to in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of the assessed network methods are not able to predict network structures from single cell expression data accurately, even if they were specifically developed for single cell data. Also, single cell methods, which usually depend on more elaborate algorithms, generally show less similarity to one another in the sets of edges they detect. The results from this study emphasize the importance of developing more accurate, optimized network modeling methods that are compatible with single cell data. Newly developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when interpreting these results.

Highlights

It is a fundamental fact of biology that genes do not operate in isolation, yet methods that infer regulatory networks from single cell gene expression data have been slow to emerge.
Most network inference methods, including those designed for single cells, cannot correctly reconstruct networks from simulated gene expression data. Evaluation of the network methods using Precision-Recall (PR) and Receiver Operating Characteristic (ROC) curves [41] showed that all methods performed poorly when applied to the simulated datasets that mimic single cell experimental data (Fig. 2).
For all three single cell methods (Single-Cell rEgulatory Network Inference and Clustering (SCENIC), SCODE, and Partial Information Decomposition and Context (PIDC); SCENIC was applied only to the embryonic stem cell (ESC) and hematopoietic stem cell (HSC) datasets, see Methods), we found that fewer edges overlapped between the learned networks, and the recovery of reference edges was even poorer (Fig. 6).
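
The two kinds of comparison mentioned in these highlights, scoring predicted edges against a reference network and measuring how much the edge sets of two methods overlap, can be illustrated with a minimal sketch. This is not the evaluation code used in the study: the gene names, scores, reference set, and the treatment of edges as directed pairs are all hypothetical, and the area under the PR curve is approximated here by average precision.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # assumption: a directed (regulator, target) pair


def auroc(scores: Dict[Edge, float], reference: Set[Edge]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for e, s in scores.items() if e in reference]
    neg = [s for e, s in scores.items() if e not in reference]
    if not pos or not neg:
        raise ValueError("need at least one true and one false candidate edge")
    # Count (true, false) pairs in which the true edge scores higher; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores: Dict[Edge, float], reference: Set[Edge]) -> float:
    """Mean of the precision at each rank where a reference edge is recovered."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits, total = 0, 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in reference:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("no reference edge among the candidates")
    return total / hits


def jaccard(a: Set[Edge], b: Set[Edge]) -> float:
    """Shared edges divided by all edges predicted by either method."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical edge scores from one method and a hypothetical reference network.
scores = {("g1", "g2"): 0.9, ("g1", "g3"): 0.8, ("g2", "g3"): 0.3, ("g3", "g1"): 0.1}
reference = {("g1", "g2"), ("g2", "g3")}

print(auroc(scores, reference))                        # 0.75
print(round(average_precision(scores, reference), 3))  # 0.833

# Hypothetical top-ranked edge sets from two different methods.
method_a = {("g1", "g2"), ("g1", "g3")}
method_b = {("g1", "g2"), ("g2", "g3")}
print(round(jaccard(method_a, method_b), 3))           # 0.333
```

An AUROC near 0.5 corresponds to ranking edges no better than chance, which is the sense in which a method can be said to perform poorly on this kind of benchmark.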

Summary

Introduction

It is a fundamental fact of biology that genes do not operate in isolation, yet methods that infer regulatory networks from single cell gene expression data have been slow to emerge. With single cell sequencing methods becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Single cell gene expression data have inherent features that distinguish them from their bulk sample counterparts and that require additional attention in statistical analysis and bioinformatics modeling. One such feature is that single cell data have higher rates of zero values than bulk sample data. In a single cell setting, these higher rates of zero values mean that filtering or imputation approaches may distort the overall shape of the gene expression distribution substantially, so a more careful set of preprocessing rules is required [2, 3].
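
To make the point about zero values concrete, here is a small sketch using made-up expression values for one gene. The numbers are hypothetical and are not taken from the study. It shows how simply filtering out zeros changes a summary of the distribution far more for a zero-rich single cell profile than for a bulk-like one.

```python
from statistics import mean
from typing import List


def zero_fraction(values: List[float]) -> float:
    """Fraction of measurements that are exactly zero."""
    return sum(v == 0 for v in values) / len(values)


# Hypothetical expression values for one gene (not real data).
bulk = [12, 9, 15, 11, 10, 13, 8, 14]        # eight bulk samples
single_cell = [0, 0, 25, 0, 31, 0, 0, 18]    # eight individual cells

for name, values in [("bulk", bulk), ("single cell", single_cell)]:
    nonzero = [v for v in values if v != 0]
    print(
        name,
        "zero fraction:", zero_fraction(values),
        "mean (all):", mean(values),
        "mean (zeros filtered):", round(mean(nonzero), 2),
    )
```

In this toy case the bulk-like values are unaffected by the filter, while the single cell mean rises from 9.25 to about 24.67 once zeros are dropped, which is one simple way a preprocessing choice can reshape the distribution that a network method then sees.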

Journal: BMC Bioinformatics
Publication Date: Jun 19, 2018
Citations: 199
License type: open-access
