Balanced chromosomal rearrangements (BCRs) have been used for decades to identify disease loci, an approach that has been facilitated by global paired-end sequencing methods, which detect chromosomal breakpoints at near-sequence resolution. BCRs cause disease not only by gene truncation, but also by removing cis-acting regulatory elements, e.g. tissue-specific enhancers, from specific target genes (long-range position effects, LRPE), and/or by bringing external regulatory elements into contact with a new target gene (enhancer adoption). In this study, we show that the majority of the known LRPE-associated breakpoints occur in Topologically Associating Domains (TADs) that overlap with clusters of evolutionarily conserved non-genic elements (CNEs), and in which the dysregulated target genes are frequently marked by large transposon-free promoters. By a genome-wide search we defined 316 TADs (cneTADs) with these conservation features, covering 18% of the genome, with 382 predicted target genes that are highly enriched in developmental genes, including homeobox genes and other transcription factors. By systematic whole-genome mate-pair sequencing, with which we more than double the number of mapped two-way BCRs and include the first large cohort of healthy BCR carriers, we show that >40% of all affected BCR carriers, including all published ones, have one or two breakpoints in cneTADs, a significant increase compared to the frequency in healthy BCR carriers. Our data support not only that cneTADs are candidate high-risk LRPE regions, but also that TAD boundaries are LRPE boundaries. Moreover, cneTADs are high-risk regions for enhancer adoption events and for a new, extreme variant of these, enhancer swapping, in which exchanges between two cneTADs may lead to unpredictable, novel phenotypes.
Our concerted action has the potential to establish a saturated map of genotype-phenotype links for this major part of the human developmental regulome, in addition to providing genotype-phenotype links for thousands of truncated protein-coding and non-coding genes.