Abstract
Adaptive mutations play an important role in molecular evolution. However, the frequency and nature of these mutations at the intramolecular level are poorly understood. To address this, we analyzed the impact of protein architecture on the rate of adaptive substitutions, aiming to understand how protein biophysics influences fitness and adaptation. Using Drosophila melanogaster and Arabidopsis thaliana population genomics data, we fitted models of distribution of fitness effects and estimated the rate of adaptive amino-acid substitutions both at the protein and amino-acid residue level. We performed a comprehensive analysis covering genome, gene, and protein structure, by exploring a multitude of factors with a plausible impact on the rate of adaptive evolution, such as intron number, protein length, secondary structure, relative solvent accessibility, intrinsic protein disorder, chaperone affinity, gene expression, protein function, and protein–protein interactions. We found that the relative solvent accessibility is a major determinant of adaptive evolution, with most adaptive mutations occurring at the surface of proteins. Moreover, we observe that the rate of adaptive substitutions differs between protein functional classes, with genes encoding for protein biosynthesis and degradation signaling exhibiting the fastest rates of protein adaptation. Overall, our results suggest that adaptive evolution in proteins is mainly driven by intermolecular interactions, with host–pathogen coevolution likely playing a major role.
Highlights
A long-standing focus in the study of molecular evolution is the role of natural selection in protein evolution (Eyre-Walker 2006).
To identify the genomic and structural variants driving protein adaptive evolution, we analyzed two data sets: 10,318 protein-coding genes in 114 Drosophila melanogaster genomes, with polymorphism data from an admixed sub-Saharan population from Phase 2 of the Drosophila Population Genomics Project (DPGP2, Pool et al. 2012) and divergence to D. simulans; and 18,669 protein-coding genes in 110 Arabidopsis thaliana genomes, with polymorphism data from a Spanish population (1001 Genomes Project, Weigel and Mott 2009) and divergence to A. lyrata.
Grapes estimates the rate of nonadaptive nonsynonymous substitutions (ωna), which is used to estimate the rate of adaptive nonsynonymous substitutions (ωa) and the proportion of adaptive nonsynonymous substitutions (α).
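The relationship between these three quantities can be sketched in a few lines. This is a minimal illustration, not the Grapes implementation: it assumes the nonadaptive rate has already been estimated from a fitted distribution of fitness effects, writes the total ratio of nonsynonymous to synonymous divergence (dN/dS) as omega, and uses the standard decomposition in which the adaptive rate is the remainder, omega minus the nonadaptive rate, and the adaptive proportion is that remainder divided by omega. The function name and the input values are hypothetical and chosen only for the example.

```python
def adaptive_rates(dn, ds, omega_na):
    """Split dN/dS into adaptive and nonadaptive parts.

    dn, ds    -- nonsynonymous and synonymous divergence (substitutions per site)
    omega_na  -- rate of nonadaptive nonsynonymous substitutions, assumed to
                 come from a previously fitted distribution of fitness effects
    Returns (omega_a, alpha).
    """
    if ds <= 0:
        raise ValueError("synonymous divergence must be positive")
    omega = dn / ds               # total rate relative to the neutral expectation
    omega_a = omega - omega_na    # rate of adaptive substitutions
    # proportion of substitutions that are adaptive; undefined when omega is 0
    alpha = omega_a / omega if omega > 0 else float("nan")
    return omega_a, alpha


# Hypothetical values for illustration only (not taken from the study).
omega_a, alpha = adaptive_rates(dn=0.02, ds=0.1, omega_na=0.15)
print(f"omega_a = {omega_a:.3f}, alpha = {alpha:.3f}")
```

Note that the two summary statistics can move independently: a gene category can have a low adaptive proportion yet a high adaptive rate if its overall dN/dS is large, which is why both are reported.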
Summary
A long-standing focus in the study of molecular evolution is the role of natural selection in protein evolution (Eyre-Walker 2006). When looking at adaptive and nonadaptive substitutions separately, we observe a significant negative impact on values of ωa in D. melanogaster and ωna in A. thaliana.