Abstract

Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood. Here we use our software, HiCExplorer (hicexplorer.readthedocs.io), to annotate >2800 high-resolution (570 bp) TAD boundaries in Drosophila melanogaster. We identify eight DNA motifs enriched at boundaries, including a motif bound by the M1BP protein and two new boundary motifs. In contrast to mammals, the CTCF motif is enriched only at a small fraction of boundaries flanking inactive chromatin, whereas most active boundaries contain the motifs bound by the M1BP or Beaf-32 proteins. We demonstrate that boundaries can be accurately predicted using only the motif sequences at open chromatin sites. We propose that DNA sequence guides genome architecture through the allocation of boundary proteins in the genome. Finally, we present an interactive online database for accessing and exploring the spatial organization of the fly, mouse and human genomes, available at http://chorogenome.ie-freiburg.mpg.de.

Highlights

  • Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood

  • The CCCTC-binding factor (CTCF) protein has been shown to be enriched at chromatin loops, which demarcate a subset of TAD boundaries[5]

  • Apart from CTCF, the following DNA-binding insulator proteins have been associated with boundaries[3,10]: Boundary Element Associated Factor-32 (Beaf-32), Suppressor of Hairy-wing (Su(Hw)), and GAGA factor (GAF)


Summary

Introduction

Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood. The CTCF motif is enriched only at a small fraction of boundaries flanking inactive chromatin, whereas most active boundaries contain the motifs bound by the M1BP or Beaf-32 proteins. We develop software (HiCExplorer) to obtain boundary positions at 0.5 kilobase resolution based on published Hi-C sequencing data from the Drosophila melanogaster Kc167 cell line[15,16]. Using these high-resolution TAD boundaries, we identify eight significantly enriched DNA motifs. Five of these motifs are known to be bound by insulator proteins: Beaf-32, CTCF, the heterodimer Ibf1 and Ibf2, Su(Hw) and ZIPIC. To facilitate exploration of available Hi-C data, we provide an interactive online database containing processed high-resolution Hi-C data sets from the fly, mouse and human genomes, available at http://chorogenome.ie-freiburg.mpg.de.
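
To make the notion of a motif being "significantly enriched" at boundaries concrete, the following is a minimal sketch of one common way to test such enrichment: a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) comparing motif occurrence at boundary regions against background regions. This is an illustrative assumption, not necessarily the test used in the study; the function name, its parameters, and the counts in the usage example are hypothetical and are not taken from the paper.

```python
from math import comb


def motif_enrichment_pvalue(boundaries_with_motif, total_boundaries,
                            background_with_motif, total_background):
    """One-sided Fisher's exact p-value for motif over-representation.

    Hypothetical helper: returns the probability of seeing at least
    `boundaries_with_motif` motif-containing regions among the boundaries
    if motif-containing regions were distributed at random between the
    boundary set and the background set.
    """
    population = total_boundaries + total_background
    motif_total = boundaries_with_motif + background_with_motif
    # The number of motif hits among boundaries cannot exceed either total.
    max_hits = min(total_boundaries, motif_total)

    # Number of ways to pick the boundary set from the whole population.
    all_draws = comb(population, total_boundaries)

    # Sum the hypergeometric probabilities for k >= observed hits.
    tail = sum(
        comb(motif_total, k)
        * comb(population - motif_total, total_boundaries - k)
        for k in range(boundaries_with_motif, max_hits + 1)
    )
    return tail / all_draws


if __name__ == "__main__":
    # Toy counts for illustration only (not from the study):
    # 4 of 10 boundaries and 2 of 20 background regions carry the motif.
    p = motif_enrichment_pvalue(4, 10, 2, 20)
    print(f"one-sided p-value: {p:.4f}")
```

In a real analysis, a test like this would be run once per motif, so the resulting p-values would need correction for multiple testing before any motif is called enriched.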
