Local Epigenomic Data are more Informative than Local Genome Sequence Data in Predicting Enhancer-Promoter Interactions Using Neural Networks.

Mengli Xiao, Wei Pan, Zhong Zhuang

doi:10.3390/genes11010041

Abstract

Enhancer-promoter interactions (EPIs) are crucial for transcriptional regulation. Mapping such interactions is useful for understanding disease regulation and for discovering risk genes in genome-wide association studies. Previous studies showed that machine learning methods, as computational alternatives to costly experimental approaches, performed well in predicting EPIs from local sequence and/or local epigenomic data. In particular, deep learning methods were reported to outperform traditional machine learning methods, and DNA sequence data alone were reported to perform better than, or almost as well as, epigenomic data alone. However, most, if not all, of these studies randomly split enhancer-promoter pairs into training, tuning, and test data, which has recently been pointed out to be problematic: because the same or overlapping enhancers (and promoters) appear in multiple enhancer-promoter pairs in EPI data, such random splitting does not yield independent training, tuning, and test data, and thus leads to model over-fitting and over-estimated predictive performance. Here, after correcting this design issue, we extensively studied the performance of various deep learning models with local sequence and epigenomic data around enhancer-promoter pairs. Our results confirmed much lower performance than previously reported when using sequence data alone, epigenomic data alone, or both. We also demonstrated that local epigenomic features were more informative than local sequence data. Our results were based on an extensive exploration of many convolutional neural network (CNN) and feed-forward neural network (FNN) structures, and of gradient boosting as a representative of traditional machine learning.
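The splitting problem described in the abstract can be illustrated with a minimal sketch. It assumes each enhancer-promoter pair carries a chromosome label and that whole chromosomes are assigned to one set, which is one common way to keep shared enhancers and promoters from leaking across sets; the abstract does not state that this is the exact procedure used. The function name `split_by_chromosome`, the record fields, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_chromosome(pairs, test_frac=0.2, tune_frac=0.1, seed=0):
    """Assign whole chromosomes to train/tune/test sets.

    Assumption: every pair lies on a single chromosome, so grouping by
    chromosome keeps each enhancer and promoter inside exactly one set.
    """
    by_chrom = defaultdict(list)
    for pair in pairs:
        by_chrom[pair["chrom"]].append(pair)

    chroms = sorted(by_chrom)
    random.Random(seed).shuffle(chroms)

    # Hold out at least one chromosome for each of the test and tuning sets.
    n_test = max(1, round(len(chroms) * test_frac))
    n_tune = max(1, round(len(chroms) * tune_frac))
    groups = {
        "test": chroms[:n_test],
        "tune": chroms[n_test:n_test + n_tune],
        "train": chroms[n_test + n_tune:],
    }
    return {name: [p for c in cs for p in by_chrom[c]]
            for name, cs in groups.items()}


# Hypothetical toy data: each enhancer appears in two pairs, mimicking the
# duplication that makes a purely random split of pairs non-independent.
toy_pairs = [
    {"chrom": f"chr{c}", "enhancer": f"chr{c}_e{e}",
     "promoter": f"chr{c}_p{p}", "label": (e + p) % 2}
    for c in range(1, 6) for e in range(2) for p in range(2)
]

splits = split_by_chromosome(toy_pairs)
for name, subset in splits.items():
    print(name, len(subset), sorted({p["chrom"] for p in subset}))
```

With a random split of individual pairs, the two pairs sharing an enhancer could land in different sets; grouping by chromosome rules that out by construction.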

Highlights

Non-coding genome sequences, including enhancers, promoters, and other regulatory elements, play important roles in transcriptional regulation
We compared the performance of the two data sources with similar models side by side in Figure 5b, where the basic convolutional neural network (CNN), Residual Neural Network (ResNet) CNN and gradient boosting were customized to the two data sources during the training process. p-values of the paired t-test to compare model weighted average Area Under Receiver Operating Characteristic (AUROC) were all
Through an extensive evaluation of the use of various neural networks, especially convolutional neural networks (CNNs), on predicting enhancer-promoter interactions (EPIs), we demonstrated that local epigenomic features were more predictive than local sequence data
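The comparison mentioned in the highlights (a paired t-test on weighted average AUROC) can be sketched as follows. This is a minimal illustration under stated assumptions: the weights are taken to be test-set sizes, the pairing is over the same evaluation units for both data sources, and every number below is a hypothetical placeholder, not a result from the paper. The helper names are also hypothetical.

```python
import math
import statistics


def weighted_average(values, weights):
    """Average of values weighted by, e.g., the number of test pairs."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def paired_t_statistic(a, b):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical placeholder AUROCs for four evaluation units (NOT paper data).
auroc_epigenomic = [0.70, 0.68, 0.72, 0.66]
auroc_sequence = [0.60, 0.61, 0.63, 0.58]
test_set_sizes = [1200, 900, 1500, 800]  # hypothetical weights

print("weighted AUROC (epigenomic):",
      weighted_average(auroc_epigenomic, test_set_sizes))
print("weighted AUROC (sequence):",
      weighted_average(auroc_sequence, test_set_sizes))

t_stat, dof = paired_t_statistic(auroc_epigenomic, auroc_sequence)
print("paired t statistic:", t_stat, "degrees of freedom:", dof)
```

A p-value follows from the t distribution with the returned degrees of freedom; `scipy.stats.ttest_rel` computes the same statistic together with the p-value if SciPy is available.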

Summary

Introduction

Non-coding genome sequences, including enhancers, promoters, and other regulatory elements, play important roles in transcriptional regulation. Through enhancer-promoter interactions (i.e., physical contacts), enhancers and promoters coordinately regulate gene expression. Enhancers can be distal from promoters in the genome; the two are brought close to, and possibly into contact with, each other in 3-D space through chromatin looping. Some enhancers even bypass adjacent promoters to interact with their target promoters in response to histone or transcriptional modifications on the genome. An accurate mapping of such distant interactions is of particular interest for understanding gene expression pathways and identifying target genes of genome-wide association study (GWAS) loci [1,2,3].

Journal: Genes
Publication Date: Dec 29, 2019
Citations: 5
License type: CC BY 4.0
