Predicting gene expression from genome wide protein binding profiles

Mohsina M Ferdous, Paul Wilson, Xiaohui Liu, Veronica Vinciotti, Yanchun Bao

doi:10.1016/j.neucom.2017.09.094

Abstract

High-throughput technologies such as chromatin immunoprecipitation (IP) followed by next generation sequencing (ChIP-seq), in combination with gene expression studies, have enabled researchers to investigate relationships between the distribution of chromosome-associated proteins and the regulation of gene transcription on a genome-wide scale. Several attempts at integrative analyses have identified direct relationships between the two processes. However, a comprehensive understanding of the regulatory events remains elusive. This is in part due to the scarcity of robust analytical methods for the detection of binding regions from ChIP-seq data. In this paper, we apply a recently proposed Markov random field model for the detection of enriched binding regions under different biological conditions and time points. The method accounts for spatial dependencies and IP efficiencies, which can vary significantly between different experiments. We further classify the enriched chromosomal binding regions into distinct genomic features, such as promoter, exon, intron, and distal intergenic, and then investigate how predictive each of these features is of gene expression activity using machine learning techniques, including neural networks, decision trees and random forests. The analysis of a ChIP-seq time-series dataset comprising six protein markers and associated microarray data, obtained from the same biological samples, shows promising results and identifies biologically plausible relationships between the protein profiles and gene regulation.
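The step of assigning enriched binding regions to genomic features can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the single forward-strand toy gene, its coordinates, and the 1 kb promoter window are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: a region counts as "promoter" if its midpoint lies within
# 1 kb upstream of the transcription start site (TSS). The paper's actual
# window is not stated in this text.
PROMOTER_UPSTREAM = 1000


def annotate_region(mid: int, tss: int, gene_end: int,
                    exons: List[Tuple[int, int]]) -> str:
    """Classify a binding-region midpoint relative to one forward-strand gene."""
    if tss - PROMOTER_UPSTREAM <= mid < tss:
        return "promoter"
    if tss <= mid <= gene_end:
        for start, end in exons:
            if start <= mid <= end:
                return "exon"
        return "intron"
    return "distal intergenic"


def feature_counts(midpoints: List[int], tss: int, gene_end: int,
                   exons: List[Tuple[int, int]]) -> Dict[str, int]:
    """Count how many enriched regions fall into each feature class."""
    counts = {"promoter": 0, "exon": 0, "intron": 0, "distal intergenic": 0}
    for mid in midpoints:
        counts[annotate_region(mid, tss, gene_end, exons)] += 1
    return counts


# Hypothetical toy gene: TSS at 5000, gene end at 6300, two exons.
TOY_EXONS = [(5000, 5200), (6000, 6300)]
toy_counts = feature_counts([4500, 5100, 5500, 9000, 4999],
                            tss=5000, gene_end=6300, exons=TOY_EXONS)
```

Per-gene counts of this kind, computed for each protein marker, are one possible way to build the feature vectors that a classifier is then trained on.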

Highlights

Chromatin immunoprecipitation combined with massively parallel DNA sequencing (ChIP-seq) is a method used to identify the binding sites of chromosome-associated/‘epigenetic’ proteins. (Note that the term epigenetic will be used in its broadest sense throughout this manuscript.)
All data values were collected from murine bone-marrow derived macrophages (BMDMs) stimulated with lipopolysaccharide (LPS), and from LPS-stimulated BMDMs treated with a synthetic compound (I-BET).
The epigenetic data was generated from a ChIP-seq time-series dataset that included quantification of bromodomain-containing protein 4 (Brd4); acetylated histone H4 (H4ac); histone H3 lysine 4 tri-methylation (H3K4me3); RNA polymerase II (RNA PolII); subunit of RNA polymerase II (RNA PolII S2); and cyclin-dependent kinase 9 (CDK9).

Summary

Introduction

Chromatin immunoprecipitation combined with massively parallel DNA sequencing (ChIP-seq) is a method used to identify the binding sites of chromosome-associated/‘epigenetic’ proteins. (Note that the term epigenetic will be used in its broadest sense throughout this manuscript.) ChIP-seq in combination with gene expression data enables researchers to investigate relationships between chromosome-bound protein regulatory mechanisms and gene expression responses on a genome-wide scale. In many studies the ChIP-seq data is in the public domain but the corresponding gene expression data is not available; in such cases it is not possible to determine how epigenetic modifications dictate gene expression responses [8]. We propose that machine learning models could be used to address such situations, by modelling the mechanistic relationships between observed gene expression responses and the corresponding epigenetic modifications. Once the association between gene expression and epigenetic regulatory events is defined, it should be possible to predict one from the other and extrapolate this information into a deeper understanding of gene regulation mechanisms.
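The idea of predicting expression from binding features can be sketched with a one-level decision tree (a decision stump). This is an illustrative simplification and not the models fitted in the paper. The feature names combine markers named above with feature classes, but the data values are synthetic and the helper names `fit_stump` and `predict` are hypothetical.

```python
from typing import Dict, List, Tuple

Stump = Tuple[str, int, int]  # (feature, label if feature == 0, label if feature == 1)


def fit_stump(X: List[Dict[str, int]], y: List[int]) -> Stump:
    """Choose the single binary feature whose split misclassifies the fewest genes."""
    best = None
    for feat in sorted(X[0]):
        # Try both label assignments for the two branches of the split.
        for label0, label1 in ((0, 1), (1, 0)):
            errors = sum(
                (label1 if row[feat] else label0) != target
                for row, target in zip(X, y)
            )
            if best is None or errors < best[0]:
                best = (errors, feat, label0, label1)
    _, feat, label0, label1 = best
    return feat, label0, label1


def predict(stump: Stump, row: Dict[str, int]) -> int:
    """Predict the expression class (1 = induced, 0 = not induced) for one gene."""
    feat, label0, label1 = stump
    return label1 if row[feat] else label0


# Synthetic toy data (not from the paper): 1 means an enriched region of that
# marker was detected in that feature class for the gene.
TOY_X = [
    {"Brd4_promoter": 1, "H3K4me3_promoter": 1, "H4ac_intron": 0},
    {"Brd4_promoter": 0, "H3K4me3_promoter": 1, "H4ac_intron": 1},
    {"Brd4_promoter": 1, "H3K4me3_promoter": 0, "H4ac_intron": 1},
    {"Brd4_promoter": 0, "H3K4me3_promoter": 0, "H4ac_intron": 0},
    {"Brd4_promoter": 1, "H3K4me3_promoter": 1, "H4ac_intron": 1},
    {"Brd4_promoter": 1, "H3K4me3_promoter": 0, "H4ac_intron": 0},
]
TOY_Y = [1, 1, 0, 0, 1, 0]

toy_stump = fit_stump(TOY_X, TOY_Y)
```

In this toy example the labels were constructed so that one feature separates the classes perfectly; real data would call for held-out evaluation and the richer models named in the abstract, such as full decision trees, random forests or neural networks.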

Journal: Neurocomputing	Publication Date: Oct 6, 2017
Citations: 2	License type: cc-by
