Abstract

Identifying which distal regulatory elements act on a given target gene is a central challenge in understanding gene regulation and elucidating the causes of complex diseases. Among known distal regulatory elements, enhancers interact with a target gene's promoter to regulate its expression. Although many machine learning approaches can now predict enhancer-promoter interactions (EPIs), global and precise prediction of EPIs at the genomic level still requires further exploration. In this paper, we develop an integrated EPI prediction method, called EpPredictor, with improved performance. EpPredictor combines features from histone modifications, transcription factor binding sites, and DNA sequences across the human genome, and uses LightGBM, a supervised gradient-boosting algorithm, to predict EPIs. Across six cell lines, our method predicts EPIs effectively and achieves a higher F1-score and AUC than other methods such as TargetFinder and PEP.

Highlights

  • Enhancers are key cis-elements that regulate spatiotemporal gene expression by contacting their target genes

  • We developed EpPredictor, a LightGBM-based method for predicting enhancer-promoter interactions

  • Many methods use only features from the promoter and enhancer regions to predict enhancer-promoter interactions (EPIs)


Summary

Introduction

Enhancers are key cis-elements that regulate spatiotemporal gene expression by contacting their target genes. One cognate gene can be controlled by multiple enhancers; in turn, one enhancer can interact with more than one target gene [4]. Together, these many-to-many relationships form a complicated, nonlinear regulatory network. By reconstructing regulatory landscapes from different features and integrating hundreds of genomics data sets, TargetFinder can accurately predict individual enhancer-promoter interactions (EPIs) using features from the enhancer, the promoter, and the window region between them. PEP consists of two modules (PEP-motif and PEP-word), which use different feature extraction methods. These studies provide insight into how epigenetic features and sequences correlate with EPIs. Unlike the above methods, we develop a LightGBM-based algorithm, named EpPredictor, to predict EPIs.
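
As a rough illustration of the three regions mentioned above (the enhancer, the promoter, and the window between them), the following Python sketch derives the window interval for an enhancer-promoter pair and computes one simple feature per region: the fraction of the region covered by signal peaks. The `Region` type, the function names, the half-open coordinate convention, and the coverage feature itself are assumptions made for illustration; they are not taken from TargetFinder or EpPredictor.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    """Hypothetical genomic interval, half-open: [start, end)."""
    chrom: str
    start: int
    end: int


def window_between(enhancer: Region, promoter: Region) -> Optional[Region]:
    """Return the interval lying between the two elements.

    Returns None if they are on different chromosomes or if they
    overlap or touch, in which case no window exists.
    """
    if enhancer.chrom != promoter.chrom:
        return None
    left, right = sorted([enhancer, promoter], key=lambda r: r.start)
    if left.end >= right.start:
        return None
    return Region(left.chrom, left.end, right.start)


def coverage(region: Region, peaks: List[Region]) -> float:
    """Fraction of `region` covered by peaks.

    Assumes the peaks do not overlap one another; overlapping peaks
    would be counted twice.
    """
    covered = 0
    for peak in peaks:
        if peak.chrom != region.chrom:
            continue
        overlap = min(region.end, peak.end) - max(region.start, peak.start)
        covered += max(0, overlap)
    return covered / (region.end - region.start)


def pair_features(enhancer: Region, promoter: Region,
                  peaks: List[Region]) -> List[float]:
    """Feature vector [enhancer, promoter, window] for one candidate pair."""
    window = window_between(enhancer, promoter)
    window_cov = coverage(window, peaks) if window is not None else 0.0
    return [coverage(enhancer, peaks), coverage(promoter, peaks), window_cov]
```

In a real pipeline, one such vector per signal track (for example, per histone mark) would be concatenated and passed to the classifier; the sketch only shows how the window region extends the enhancer-only and promoter-only features.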

Datasets
Determine the region of feature extraction
Epigenomic features extraction
Sequence feature extraction
Choose LightGBM as well as determine the region and feature selection
Method
Findings
Discussion and Conclusion
