Abstract

Background: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data.

Results: We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes.

Conclusions: Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models.

Highlights

  • Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes

  • We present a sparse hierarchical Bayesian factor model to explore the network structure associated with genes

  • We found that the false discovery rate (FDR) for SCODE and GENIE3 tends to remain fairly stable across different threshold choices, whereas the FDR of PIDC tends to increase as the threshold increases


Summary

Introduction

Sekula et al. BMC Bioinformatics (2020) 21:361

Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. Deriving co-expression networks from gene expression data is a primary goal in numerous biological studies. These networks, which are commonly referred to as gene co-expression networks (GCNs), are constructed by identifying pairs of genes that have significant associations between their expression profiles across samples. Genes are represented by nodes in GCNs, and co-expression values are represented by edges that connect pairs of nodes. These edges are undirected: they indicate the relationships or dependencies between genes, not the underlying cause of these associations. This makes GCNs different from gene regulatory networks, which have directed edges to infer causal relationships [1]. Researchers are able to identify novel interactions and relationships between genes by exploring GCNs [3, 4].
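
To make the node-and-edge description concrete, the following is a minimal sketch of a generic co-expression network built by thresholding pairwise Pearson correlations. It is not the sparse Bayesian factor model proposed in the paper; it only illustrates the idea of undirected edges between genes with strongly associated expression profiles. The function names, the threshold of 0.8, and the toy expression values are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def build_gcn(expression, threshold=0.8):
    """Return undirected edges as {frozenset({g1, g2}): r} where |r| >= threshold.

    Using a frozenset as the key makes the edge undirected:
    the pair (g1, g2) and the pair (g2, g1) are the same edge.
    """
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[frozenset((g1, g2))] = r
    return edges


# Hypothetical toy data: gene -> expression values across five samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with geneA
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to the others
}

network = build_gcn(toy)
for pair, r in network.items():
    sign = "positive" if r > 0 else "negative"
    print(sorted(pair), sign)
```

The sign of each retained correlation distinguishes positive from negative association, and nothing in the structure records which gene influences which, in line with the contrast with directed gene regulatory networks drawn above. A simple correlation threshold like this does not account for the zero inflation and overdispersion of scRNA-seq counts that motivate the paper's model.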

