Abstract

Background: During eukaryotic genome evolution, tandem gene duplication is the most frequent event giving rise to clustered gene families. However, how expression divergence between tandemly duplicated genes emerged and has been maintained remains unclear. In particular, it is unknown whether epigenetic regulators have been involved in the process.

Results: We demonstrate that CCCTC-binding factor (CTCF), the master epigenetic regulator and the only known insulator protein in humans, has played a predominant role in generating divergence in both expression profiles and expression levels between adjacent paralogs in the human genome. This phenomenon was not observed for non-paralogous adjacent genes. After tandem duplication events, CTCF-binding sites gradually accumulate between paralogs. This trend was more prominent for genes involved in particular functions.

Conclusions: The accumulation of CTCF-binding sites drives expression divergence of tandemly duplicated genes. This process is likely targeted by natural selection. Our study reveals the importance of CTCF to the evolution of animal diversity and complexity.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-S1-S8) contains supplementary material, which is available to authorized users.

Highlights

  • To determine whether the effects of intergenic distance, the number of CTCF-binding sites, and DNA methylation are related to shared evolutionary origin, we measured ExpD1-r and ExpDEuc in non-paralogous adjacent genes

  • CTCF binding can vary among cell types [25]; when we defined #CTCF using joint CTCF ChIP-seq peaks instead of overlapping CTCF peaks, the results did not change (Table S1 in Additional file 1)
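
The two divergence measures named in the highlights, ExpD1-r and ExpDEuc, are not defined in this excerpt. The sketch below shows one way they could be computed, assuming that ExpD1-r is one minus the Pearson correlation between two genes' expression profiles across tissues (profile divergence) and that ExpDEuc is the Euclidean distance between the same profiles (level divergence). These readings are suggested by the names only and are not confirmed by the text here. The function names and the example expression values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def exp_d_one_minus_r(x, y):
    """Assumed reading of ExpD1-r: 1 - Pearson r (0 = identical profile shape)."""
    return 1.0 - pearson_r(x, y)


def exp_d_euclidean(x, y):
    """Assumed reading of ExpDEuc: Euclidean distance between the two profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    # Hypothetical expression values for two adjacent genes across four tissues.
    gene_a = [5.0, 8.0, 2.0, 9.0]
    gene_b = [4.0, 1.0, 7.0, 3.0]
    print("ExpD(1-r):", exp_d_one_minus_r(gene_a, gene_b))
    print("ExpD(Euc):", exp_d_euclidean(gene_a, gene_b))
```

Under these assumptions the two measures capture different things: a pair of genes whose profiles have the same shape but different magnitudes would score near zero on the correlation-based measure and still show a nonzero Euclidean distance.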

Summary

Introduction

Tandem duplication is the most common route to the formation of clustered paralogous genes [3,4]. To acquire novel transcription patterns, tandemly duplicated genes must overcome the expression similarity imposed by the upstream cis-elements they share at origin [3,7] and by the transcriptional interference that results from physical proximity [8]. Functionally important gene clusters consisting of paralogs with distinct expression patterns are found in a wide range of species, including humans [9,10]. How expression divergence between tandemly duplicated genes emerged and has been maintained remains unclear, and it is unknown whether epigenetic regulators have been involved in the process. It is important to understand the origin and maintenance of expression divergence between tandem paralogs.
