Abstract

Background: Hi-C experiments capturing the 3D genome architecture have led to the discovery of topologically-associated domains (TADs) that form an important part of the 3D genome organization and appear to play a role in gene regulation and other functions. Several histone modifications have been independently associated with TAD formation, but their combinatorial effects on domain formation remain poorly understood at a global scale.

Results: We propose a convex semi-nonparametric approach called nTDP based on Bernstein polynomials to explore the joint effects of histone markers on TAD formation as well as predict TADs solely from the histone data. We find a small subset of modifications to be predictive of TADs across species. By inferring TADs using our trained model, we are able to predict TADs across different species and cell types, without the use of Hi-C data, suggesting their effect is conserved. This work provides the first comprehensive joint model of the effect of histone markers on domain formation.

Conclusions: Our approach, nTDP, can form the basis of a unified, explanatory model of the relationship between epigenetic marks and topological domain structures. It can be used to predict domain boundaries for cell types, species, and conditions for which no Hi-C data is available. The model may also be of use for improving Hi-C-based domain finders.

Highlights

  • Hi-C experiments capturing the 3D genome architecture have led to the discovery of topologically-associated domains (TADs) that form an important part of the 3D genome organization and appear to play a role in gene regulation and other functions

  • Chromatin interactions obtained from a variety of recent chromosome conformation capture experimental techniques such as Hi-C [5] have resulted in significant advances in our understanding of the geometry of chromatin structure [6, 7]

  • Analysis of the resulting matrix by Dixon et al. [8] led to the discovery of topologically-associated domains (TADs), which correspond to consecutive, highly-interacting matrix regions, typically a few megabases in size, that are close in densely packed chromatin


Summary

Results

Experimental setup. We bin ChIP-Seq histone modification and DNase-seq data at 40 kb resolution, estimate the RPKM (Reads Per Kilobase per Million) measure for each bin, and transform the value x in each bin by log(x + 1), which reduces the distorting effects of high values. These significance values, combined with the results above, suggest the importance of the identified modifications in TADs. To verify that there are no inherent structures in the data that could lead to an easy prediction, we randomly shuffle the domains in the training set, preserving their lengths and leaving the modifications unshuffled; the resulting NVI score is never better than 0.3 on any chromosome, showing the importance of histone modification distributions in TADs. nTDP predicts TADs accurately across different species as well as across different cell types, as shown in Fig. 4b–d. We suggest that some of our wrong TAD predictions may correspond to longer TAD blocks that we erroneously interpret as incorrect due to a scale mismatch.
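The preprocessing and the shuffled-domain control described in the experimental setup can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, reads are assumed to be given as 0-based start positions on one chromosome, and the shuffle is assumed to permute the order of contiguous domains while keeping each domain's length in bins.

```python
import math
import random

BIN_SIZE = 40_000  # 40 kb bins, as stated in the text


def bin_read_counts(read_positions, chrom_length, bin_size=BIN_SIZE):
    """Count reads per fixed-width bin (hypothetical helper).

    Assumes 0-based read start positions smaller than chrom_length.
    """
    n_bins = math.ceil(chrom_length / bin_size)
    counts = [0] * n_bins
    for pos in read_positions:
        counts[pos // bin_size] += 1
    return counts


def log_rpkm(counts, total_mapped_reads, bin_size=BIN_SIZE):
    """Convert per-bin counts to RPKM, then apply log(x + 1)."""
    kilobases = bin_size / 1_000
    millions = total_mapped_reads / 1_000_000
    return [math.log(c / (kilobases * millions) + 1.0) for c in counts]


def shuffle_domains(domain_lengths, rng):
    """Permute the order of domains, preserving each length in bins.

    Assumption: domains tile the region contiguously. Returns half-open
    (start_bin, end_bin) intervals; the histone signal is left untouched.
    """
    lengths = list(domain_lengths)
    rng.shuffle(lengths)
    domains, start = [], 0
    for length in lengths:
        domains.append((start, start + length))
        start += length
    return domains


if __name__ == "__main__":
    counts = bin_read_counts([0, 39_999, 40_000, 95_000], chrom_length=100_000)
    features = log_rpkm(counts, total_mapped_reads=1_000_000)
    control = shuffle_domains([3, 5, 2], random.Random(0))
    print(counts, features, control)
```

Because only the domain order changes in the control, any predictive signal that survives the shuffle would have to come from the domain length distribution alone rather than from the histone modification profile.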

