Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over a nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve motifs is a reduction in the overall rate of synonymous evolution of 2–3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
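The idea of testing whether motifs are enriched or depleted relative to a nucleotide-controlled null can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the null here is a simple shuffle of the sequence's own nucleotides (which preserves base composition but not the encoded protein), and the function names `motif_density`, `shuffled` and `normalized_density` are hypothetical.

```python
import random


def motif_density(seq, motifs):
    """Fraction of positions in seq covered by at least one motif hit."""
    if not seq:
        return 0.0
    covered = [False] * len(seq)
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            for i in range(start, start + len(motif)):
                covered[i] = True
            # Step by one so that overlapping hits are also found.
            start = seq.find(motif, start + 1)
    return sum(covered) / len(seq)


def shuffled(seq, rng):
    """Return seq with its nucleotides permuted (composition preserved)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def normalized_density(seq, motifs, n_sim=1000, seed=0):
    """Compare observed motif density with a shuffled-sequence null.

    ASSUMPTION: mononucleotide shuffling is used as the null purely for
    illustration; a real analysis would need a more careful control.
    Returns (observed, mean_simulated, normalized_density, p_enriched).
    """
    rng = random.Random(seed)
    observed = motif_density(seq, motifs)
    sims = [motif_density(shuffled(seq, rng), motifs) for _ in range(n_sim)]
    mean_sim = sum(sims) / n_sim
    # One-tailed empirical p-value for enrichment, with +1 correction.
    p_enriched = (sum(s >= observed for s in sims) + 1) / (n_sim + 1)
    if mean_sim > 0:
        nd = (observed - mean_sim) / mean_sim
    else:
        nd = float("nan")
    return observed, mean_sim, nd, p_enriched
```

A positive normalized density suggests enrichment relative to the shuffled null and a negative one suggests depletion. For example, `motif_density("AAAGAAGAAA", ["GAAGAA"])` returns 0.6, because the single hit covers six of the ten positions.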
Highlights
One of the most captivating problems in molecular evolution is that of multiple coding: how the very same DNA sequence can contain several overlapping layers of information
This means that the evolution of coding sequences (CDSs) is directed by selection pressures related to the structure of the protein encoded for and by the need to preserve such overlapping regulatory information
Putative RNA-Binding Protein (RBP) Target Motifs Are Nonneutrally Evolving in CDSs, Leading to an Overall Decrease of ~2.4% in the Human Rate of Synonymous Evolution
Putative RBP Target Motifs Are Enriched over Expected in CDSs
Is the frequency of putative RBP target motifs in CDSs consistent with neutral expectations or are there deviations that would suggest the presence of selection? We retrieved data on the experimentally determined sequence specificities of human RBPs from several databases
Summary
One of the most captivating problems in molecular evolution is that of multiple coding: how the very same DNA sequence can contain several overlapping layers of information. Protein-coding regions can overlap with transcription factor binding sites (Stergachis et al 2013; Birnbaum et al 2014) (the functionality of the sites is contested; Xing and He 2015; Agoglia and Fraser 2016), functional RNA secondary structures (Chamary and Hurst 2005; Meyer and Miklos 2005; Pedersen et al 2006; Smith et al 2013), and microRNA targets (Lewis et al 2005; Hurst 2006; Forman et al 2008; Fang and Rajewski 2011; Hausser et al 2013; Liu et al 2015). This means that the evolution of coding sequences (CDSs) is directed both by selection pressures related to the structure of the encoded protein and by the need to preserve such overlapping regulatory information.