Abstract
Modern functional genomics has uncovered numerous functional elements in metazoan genomes. Nevertheless, only a small fraction of the typical non-exonic genome contains elements that code for function directly. On the other hand, a much larger fraction of the genome is associated with significant evolutionary constraints, suggesting that much of the non-exonic genome is weakly functional. Here we show that in flies, local (30–70 bp) conserved sequence elements that are associated with multiple regulatory functions serve as focal points for a pattern of punctuated regional increase in G/C nucleotide frequencies. We show that this pattern, which covers a region tenfold larger than the conserved elements themselves, is an evolutionary consequence of a shift in the balance between gain and loss of G/C nucleotides, and that it is correlated with nucleosome occupancy across multiple classes of epigenetic state. Evidence for compensatory evolution and analysis of SNP allele frequencies show that the evolutionary regime underlying this balance shift is likely to be non-neutral. These data suggest that current gaps in our understanding of genome function and evolutionary dynamics can be explained by a model in which sparse sequence elements that directly encode function are embedded in structural sequences that help to define the local and global epigenomic context of those functional elements.
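To make the idea of a regional G/C increase around a conserved element concrete, the following is a minimal sketch of how such a profile could be tabulated. It is an illustration only, not the authors' pipeline: the function names `gc_fraction` and `gc_profile`, the tiling scheme, and the toy sequence are all assumptions introduced here.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous A/C/G/T bases; 0.0 if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_profile(seq: str, center: int, window: int, n_flank: int) -> List[float]:
    """G/C fraction in consecutive windows tiled around `center`.

    Returns 2 * n_flank + 1 values ordered from upstream to downstream;
    the middle value is the window centered on `center` (hypothetically,
    the midpoint of a conserved element). Windows are clipped at the
    sequence ends, and windows that fall entirely outside score 0.0.
    """
    half = window // 2
    profile = []
    for k in range(-n_flank, n_flank + 1):
        start = center + k * window - half
        end = start + window
        start, end = max(start, 0), min(end, len(seq))
        profile.append(gc_fraction(seq[start:end]) if start < end else 0.0)
    return profile


# Toy sequence (assumed, not real data): a G/C-rich core flanked by A/T repeats.
toy = "AT" * 10 + "GC" * 5 + "AT" * 10
print(gc_profile(toy, center=25, window=10, n_flank=2))
```

In a real analysis the window size and number of flanking windows would be chosen to span a region roughly tenfold larger than the 30–70 bp elements, and profiles would be averaged over many elements rather than read from a single locus.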
Highlights
The molecular function of metazoan genomes has been studied extensively in the last decades, using progressively more extensive and sensitive techniques for profiling genome activity, modeling epigenomic organization and perturbing genome sequences.
Evolutionary genomics, on the other hand, consistently suggests that a much larger fraction of the unannotated genome evolves under selective pressure. We hypothesize that this function-selection gap can be attributed to sequences that facilitate the physical organization of functional elements, such as transcription factor binding sites, within chromosomes.
We exemplify this by studying in detail the sequences embedding small conserved elements (CEs) in Drosophila.
Summary
The molecular function of metazoan genomes has been studied extensively in the last decades, using progressively more extensive and sensitive techniques for profiling genome activity, modeling epigenomic organization and perturbing genome sequences. Genomes have been found to encode regulatory information affecting diverse functions, including gene expression, chromatin structure, recombination and replication. Despite this progress, only a small percentage of the fly or human genome, for example, is annotated with a well-defined molecular role. Recent progress in mapping and analyzing local chromatin structure next to transcription start sites has shown that nucleosome packaging is correlated with transcription and likely affects gene expression and other biological processes. These data suggest that the functionality of regulatory sequences is modulated by the organization of nucleosomes around or over them [2,3]. A major portion of the nonexonic sequences that do not directly encode regulatory interactions is involved in (and selected for) defining the structure (i.e., nucleosome organization) of the genome.