Abstract

Cellular gene expression changes throughout a dynamic biological process, such as differentiation. Pseudotimes estimate cells' progress along a dynamic process based on their individual gene expression states. Ordering the expression data by pseudotime provides information about the underlying regulator-gene interactions. Because the pseudotime distribution is not uniform, many standard mathematical methods are inapplicable for analyzing the ordered gene expression states. Here we present single-cell inference of networks using Granger ensembles (SINGE), an algorithm for gene regulatory network inference from ordered single-cell gene expression data. SINGE uses kernel-based Granger causality regression to smooth irregular pseudotimes and missing expression values. It aggregates predictions from an ensemble of regression analyses to compile a ranked list of candidate interactions between transcriptional regulators and target genes. In two mouse embryonic stem cell differentiation datasets, SINGE outperforms other contemporary algorithms. However, a more detailed examination reveals caveats about poor performance for individual regulators and uninformative pseudotimes.

Highlights

  • Identifying the underlying gene regulatory networks (GRNs) that dictate cell fate decisions is important for understanding biological systems

  • We introduce our single-cell inference of networks using Granger ensembles (SINGE) algorithm, an ensemble-based GRN reconstruction technique that uses modified Granger causality on single-cell data annotated with pseudotimes

  • SINGE and Granger causality overview: SINGE takes ordered single-cell gene expression data as input and provides a ranked list of regulator-gene relationships as its primary output. It requires the single-cell dataset to be annotated with pseudotimes, which assign each cell a numeric value representing how far that cell has progressed through a dynamic biological process such as differentiation.
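
The input and output described above (pseudotime-annotated expression in, a ranked list of regulators out) can be illustrated with a minimal Python sketch. This is not the SINGE implementation. It assumes a Gaussian kernel, ordinary least squares in place of SINGE's modified Granger causality regression, and a plain average of ranks in place of SINGE's ensemble aggregation. The function names (`kernel_lagged`, `granger_score`, `rank_regulators`), the parameters (`lags`, `dt`, `sigma`, `settings`), and the gene labels are all hypothetical.

```python
import numpy as np

def kernel_lagged(t, x, lags, dt, sigma):
    """Kernel-weighted estimate of x at pseudotimes t_i - lag*dt.

    Only cells with a strictly earlier pseudotime contribute, so irregular
    spacing is handled by weighting instead of assuming a uniform time grid.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    earlier = t[None, :] < t[:, None]        # earlier[i, j]: cell j precedes cell i
    out = np.zeros((len(t), len(lags)))
    for k, lag in enumerate(lags):
        look_back = t[:, None] - lag * dt    # pseudotime each cell looks back to
        w = np.exp(-0.5 * ((look_back - t[None, :]) / sigma) ** 2) * earlier
        norm = w.sum(axis=1)
        # Cells with no usable earlier neighbours keep a lagged value of 0.
        out[:, k] = np.divide(w @ x, norm, out=np.zeros_like(norm), where=norm > 0)
    return out

def _rss(design, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.sum((y - design @ coef) ** 2))

def granger_score(t, regulator, target, lags=(1, 2, 3), dt=0.1, sigma=0.05):
    """Log ratio of residual error without vs. with the regulator's past."""
    y = np.asarray(target, dtype=float)
    own = kernel_lagged(t, y, lags, dt, sigma)
    other = kernel_lagged(t, regulator, lags, dt, sigma)
    ones = np.ones((len(y), 1))
    restricted = np.hstack([ones, own])      # target's own past only
    full = np.hstack([ones, own, other])     # plus the regulator's past
    return float(np.log(_rss(restricted, y) / _rss(full, y)))

def rank_regulators(t, regulators, target, settings=((0.1, 0.05), (0.2, 0.1))):
    """Average each regulator's rank over several (dt, sigma) settings.

    Assumption: mean rank is a simple stand-in for an ensemble aggregation.
    """
    names = list(regulators)
    mean_rank = np.zeros(len(names))
    for dt, sigma in settings:
        scores = np.array([granger_score(t, regulators[n], target, dt=dt, sigma=sigma)
                           for n in names])
        # argsort of argsort yields 0-based ranks; negating puts the best score first.
        mean_rank += np.argsort(np.argsort(-scores)) / len(settings)
    return [names[i] for i in np.argsort(mean_rank)]

# Synthetic example: 200 cells with non-uniform pseudotimes.
rng = np.random.default_rng(0)
pseudotime = np.sort(rng.uniform(0, 5, 200))
candidates = {"geneA": rng.normal(size=200), "geneB": rng.normal(size=200)}
target_expr = rng.normal(size=200)
print(rank_regulators(pseudotime, candidates, target_expr))
```

The kernel weighting is the design choice that matters here: because cells are not evenly spaced in pseudotime, a "lagged" value is estimated from nearby earlier cells instead of being read from a fixed previous time step.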

Introduction

Identifying the underlying gene regulatory networks (GRNs) that dictate cell fate decisions is important for understanding biological systems. Advances in single-cell transcriptomics, such as single-cell RNA-seq (scRNA-seq), have enabled observation of the gene expression states of individual cells (Tanay and Regev, 2017; Trapnell, 2015; Bacher and Kendziorski, 2016). Although these technologies solve the averaging problem faced by bulk transcriptomics, they are beset with new technical challenges, including measurement dropouts and a lower signal-to-noise ratio. Methods like GENIE3 (Huynh-Thu et al., 2010), which was originally designed to infer GRNs from bulk transcriptomics data using tree-based ensembles, can be adapted for single-cell datasets. Timestamped single-cell data enable analyzing the evolution of gene expression distributions over time (Papili Gao et al., 2017), which is not possible with bulk time series data or single-cell data collected at one time point.
