A machine learning framework for the prediction of chromatin folding in Drosophila using epigenetic features.

Michal B Rozenwald, Mikhail S Gelfand, Ekaterina E Khrameeva, Grigory V Sapunov, Aleksandra A Galitsyna

doi:10.7717/peerj-cs.307

Abstract

Technological advances have led to the creation of large epigenetic datasets, including information about DNA-binding proteins and DNA spatial structure. Hi-C experiments have revealed that chromosomes are subdivided into sets of self-interacting domains called Topologically Associating Domains (TADs). TADs are involved in the regulation of gene expression, but the mechanisms of their formation are not yet fully understood. Here, we focus on machine learning methods to characterize DNA folding patterns in Drosophila based on chromatin marks across three cell lines. We present linear regression models with four types of regularization, gradient boosting, and recurrent neural networks (RNN) as tools to study chromatin folding characteristics associated with TADs given epigenetic chromatin immunoprecipitation data. The bidirectional long short-term memory RNN architecture produced the best prediction scores and identified biologically relevant features. The distributions of the protein Chriz (Chromator) and the histone modification H3K4me3 were selected as the most informative features for the prediction of TAD characteristics. This approach may be adapted to any similar biological dataset of chromatin features across various cell lines and species. The code for the implemented pipeline, Hi-ChIP-ML, is publicly available: https://github.com/MichalRozenwald/Hi-ChIP-ML
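
As a rough illustration of the regularized linear regression idea mentioned in the abstract, the sketch below fits an L2-regularized (ridge) model and ranks features by the magnitude of their weights. It is not the Hi-ChIP-ML implementation: the abstract does not name the four regularization types, so ridge is an assumed example, and the data, the true weights, the `uninformative_mark` feature, and the helper names `ridge_fit` and `predict` are all hypothetical and synthetic.

```python
import random


def ridge_fit(X, y, alpha=1.0):
    """Solve (X^T X + alpha * I) w = X^T y by Gaussian elimination."""
    n_features = len(X[0])
    # Normal-equation matrix A = X^T X + alpha * I and vector b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(n_features)] for i in range(n_features)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n_features)]
    # Forward elimination with partial pivoting.
    for col in range(n_features):
        pivot = max(range(col, n_features), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_features):
            factor = A[r][col] / A[col][col]
            for c in range(col, n_features):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n_features
    for i in reversed(range(n_features)):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, n_features))
        w[i] = (b[i] - tail) / A[i][i]
    return w


def predict(X, w):
    """Linear prediction for each row of X."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]


# Synthetic stand-in for per-bin ChIP signals (NOT real data).
random.seed(0)
feature_names = ["Chriz", "H3K4me3", "uninformative_mark"]
X = [[random.gauss(0, 1) for _ in feature_names] for _ in range(200)]

# Assumed ground-truth weights, chosen only so the example has a known answer.
true_w = [2.0, 1.0, 0.0]
y = [sum(x * t for x, t in zip(row, true_w)) + random.gauss(0, 0.1)
     for row in X]

w = ridge_fit(X, y, alpha=1.0)

# Rank features by absolute weight, a simple proxy for informativeness.
ranking = sorted(zip(feature_names, w), key=lambda p: abs(p[1]), reverse=True)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

Ranking by absolute weight is only meaningful here because the synthetic features share the same scale; real ChIP signals would need to be standardized first.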

Journal: PeerJ Computer Science
Publication Date: Nov 30, 2020
Citations: 15
License type: CC BY 4.0

Similar Papers

Decision letter: Reorganisation of Hoxd regulatory landscapes during the evolution of a snake-like body plan
Robb Krumlauf
15 May 2016

Prediction of 3D Chromatin Structure Using Recurrent Neural Networks
Michal Rozenwald ... Grigory Sapunov
01 Dec 2018

Developmentally regulated Shh expression is robust to TAD perturbations.
Iain Williamson ... Robert E Hill
Development | VOL. 146
01 Jan 2019

Methods for the Analysis of Topologically Associating Domains (TADs).
Marie Zufferey ... Daniele Tavernari
Methods in molecular biology (Clifton, N.J.) | VOL. 2301
21 Aug 2021
