Abstract
Genome-wide co-expression analysis is often used for annotating novel gene functions from high-dimensional data. Here, we developed an R package with a Shiny visualization app that creates immuno-networks from RNAseq data using a combination of Weighted Gene Co-expression Network Analysis (WGCNA), xCell immune cell signatures, and Bayesian Network Learning. Using a large publicly available RNAseq dataset, we generated a Gene Module-Immune Cell (GMIC) network that predicted causal relationships between DEAH-box RNA helicase (DHX)15 and genes associated with humoral immunity, suggesting that DHX15 may regulate B cell fate. Deletion of DHX15 in mouse B cells led to impaired lymphocyte development, reduced peripheral B cell numbers, and dysregulated expression of genes linked to antibody-mediated immune responses, similar to the genes predicted by the GMIC network. Moreover, antigen immunization of mice demonstrated that optimal primary IgG1 responses required DHX15. Intrinsic expression of DHX15 was necessary for proliferation and survival of activated B cells. Altogether, these results support the use of co-expression networks to elucidate fundamental biological processes.
Highlights
Technological advances in the “Omics” field that generate high-dimensional datasets require advanced mathematics and computational biology models [1]
As a proof of concept, we generated a Gene Module-Immune Cell (GMIC) network from a publicly available large RNA expression dataset from lymphoma patient biopsies
We use an immune-centric analysis workflow for large expression datasets that sequentially combines two co-expression analysis methods with the xCell signature algorithm and GOstats to generate a GMIC network. This in silico model predicted a novel function for DHX15 in B cell-dependent immune responses, in which DHX15 influences modules containing MHC class II-associated genes, Tyrosine-Based Activation Motif-Bearing Adapter Protein (TYROBP) and Transferrin Receptor (TFRC)
Summary
Technological advances in the “Omics” field that generate high-dimensional datasets require advanced mathematics and computational biology models [1]. Analysis of these large “Omics” datasets through machine learning methods provides an important source for discovering biological processes. Two statistical tools commonly used for analysis of genome-wide expression data to predict gene function and disease association through gene-modules are Weighted Gene Co-expression Network Analysis (WGCNA) and Bayesian Network learning [4, 5]. Both methods use sample-to-sample variation to generate co-expression networks. Bayesian Network learning searches for parent-to-child relationships from observational data by testing different possible combinations
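To make the idea of building a co-expression network from sample-to-sample variation more concrete, the following minimal sketch computes a WGCNA-style unsigned adjacency, in which the absolute Pearson correlation between two genes is raised to a soft-threshold power. It is an illustration only and not the package described here: the gene names, expression values, function names, and the power `beta=6` are all hypothetical and chosen for demonstration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |cor(gene_i, gene_j)| ** beta.

    `expr` maps a gene name to its expression values across samples.
    `beta` is an illustrative assumption, not a value taken from the study.
    """
    genes = list(expr)
    adjacency = {}
    for g in genes:
        for h in genes:
            if g != h:
                adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency


# Hypothetical toy data: three genes measured in five samples.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.1, 9.8],  # tracks gene_a closely
    "gene_c": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly related to gene_a
}

adjacency = soft_threshold_adjacency(expr, beta=6)
for pair, weight in sorted(adjacency.items()):
    print(pair, round(weight, 4))
```

Raising the correlation to a power keeps strong gene-gene relationships close to 1 and pushes weak ones toward 0, so modules of tightly co-expressed genes stand out without a hard cutoff.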