Abstract

Background

Recent advances in sequencing technologies have enabled parallel assays of chromatin accessibility and gene expression for major human cell lines. Such innovation provides a great opportunity to decode phenotypic consequences of genetic variation via the construction of predictive gene regulatory network models. However, there is still no computational method that systematically integrates chromatin accessibility information with gene expression data to recover complicated regulatory relationships between genes in a tissue-specific manner.

Results

We propose a Markov random field (MRF) model for constructing tissue-specific transcriptional regulatory networks via integrative analysis of DNase-seq and RNA-seq data. Our method, named CSNets (cell-line specific regulatory networks), first infers regulatory networks for individual cell lines using chromatin accessibility information, and then fine-tunes these networks using the MRF based on pairwise similarity between cell lines derived from gene expression data. Using this method, we constructed regulatory networks specific to 110 human cell lines and 13 major tissues from ENCODE data. We demonstrated the high quality of these networks via comprehensive statistical analysis based on ChIP-seq profiles, functional annotations, taxonomic analysis, and literature surveys. We further applied these networks to analyze GWAS data of Crohn’s disease and prostate cancer. Results were either consistent with the literature or provided biological insights into regulatory mechanisms of these two complex diseases. The website of CSNets is freely available at http://bioinfo.au.tsinghua.edu.cn/jianglab/CSNETS/.

Conclusions

CSNets demonstrates the power of joint analysis of epigenomic and transcriptomic data for the accurate construction of gene regulatory networks. Our work provides not only a useful resource of regulatory networks to the community, but also valuable experience in methodology development for multi-omics data integration.
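
The two-step idea described in the Results (per-cell-line networks from chromatin accessibility, then refinement guided by expression similarity between cell lines) can be illustrated with a small sketch. This is not the CSNets implementation: it assumes a simple Ising-like MRF over the presence of a single TF-gene edge across cell lines, uses Pearson correlation of expression profiles as the similarity, and uses iterated conditional modes (ICM) as a stand-in inference procedure. The function names, the `coupling` parameter, and the toy numbers are all hypothetical.

```python
import numpy as np

def expression_similarity(expr):
    """Pearson correlation between cell lines.

    expr has shape (cell_lines, genes). Negative correlations are
    clipped to zero and the diagonal is zeroed (assumed choices).
    """
    sim = np.corrcoef(expr)
    np.fill_diagonal(sim, 0.0)
    return np.clip(sim, 0.0, None)

def refine_edge_icm(prior, sim, coupling=1.0, max_sweeps=20):
    """Refine one TF-gene edge across cell lines with ICM.

    prior: shape (cell_lines,), accessibility-based probability that
           the edge is present in each cell line.
    sim:   shape (cell_lines, cell_lines), symmetric non-negative
           similarity weights with a zero diagonal.
    Returns a 0/1 label per cell line.
    """
    eps = 1e-6
    p = np.clip(prior, eps, 1 - eps)
    unary = np.log(p / (1 - p))        # log-odds from accessibility
    labels = (p > 0.5).astype(int)     # step 1: per-cell-line call
    for _ in range(max_sweeps):        # step 2: smooth across cell lines
        changed = False
        for i in range(len(labels)):
            spins = 2 * labels - 1     # map {0, 1} to {-1, +1}
            field = unary[i] + coupling * np.dot(sim[i], spins)
            new_label = int(field > 0)
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:
            break
    return labels

# Toy example: cell lines 0 and 1 are similar, cell line 2 is unrelated.
prior = np.array([0.9, 0.4, 0.1])
sim = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
print(refine_edge_icm(prior, sim))  # prints [1 1 0]
```

In the toy example, the weakly supported edge in cell line 1 is switched on because its similar neighbor, cell line 0, strongly supports it, while the unrelated cell line 2 is left unchanged.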

Highlights

  • Recent advances in sequencing technologies have enabled parallel assays of chromatin accessibility and gene expression for major human cell lines

  • Marbach et al. introduced a resource of 394 human gene regulatory networks by integrating transcription factor (TF) binding motifs with Cap Analysis of Gene Expression (CAGE) data from the FANTOM5 project [12]

  • Motivated by the above understanding, we propose in this paper a Markov random field (MRF) model, named CSNets (Cell-line Specific regulatory Networks), that integrates DNase-seq data with RNA-seq data towards large-scale inference of gene regulatory networks


Summary

Introduction

Recent advances in sequencing technologies have enabled parallel assays of chromatin accessibility and gene expression for major human cell lines. Such innovation provides a great opportunity to decode phenotypic consequences of genetic variation via the construction of predictive gene regulatory network models. The complicated process of transcription in eukaryotes is largely attributable to the collaboration among DNA regulatory elements, RNA polymerases, mediator and cohesin complexes, and sequence-specific transcription factors (TFs). Such collaboration is encoded in a comprehensive gene regulatory network that determines how the expression of a gene is regulated, what responses a [...] Marbach et al. introduced a resource of 394 human gene regulatory networks by integrating TF binding motifs with Cap Analysis of Gene Expression (CAGE) data from the FANTOM5 project [12].
