Abstract

The propensity of each protein site to be occupied by any of the 20 amino acids is known as its site-specific amino acid preferences (SSAP). Under the assumption that SSAP are conserved among homologs, they can be used to parameterize evolutionary models for the reconstruction of accurate phylogenetic trees. However, simulations and experimental studies have not been able to fully assess the relative conservation of SSAP as a function of sequence divergence between protein homologs. Here, we implement a computational procedure to predict the SSAP of proteins based on the changes in thermodynamic stability caused by mutations. An advantage of this computational approach is that it allows us to interrogate a large and unbiased sample of homologous proteins, over the entire spectrum of sequence divergence, and under selection for the same molecular trait. We show that computational predictions have reproducibilities that resemble those obtained in experimental replicates, and can largely recapitulate the SSAP observed in a large-scale mutagenesis experiment. Our results support recent experimental reports on the conservation of SSAP among related homologs: the fraction of sites with different preferences increases slowly, reaching up to 15% at sequence distances lower than 40%. However, even under the sole contribution of thermodynamic stability, our conservative approach identifies up to 30% of significantly different sites between divergent homologs. We show that this relation holds for homologs of diverse sizes and structural classes. Analyses of residue contact networks suggest that an important determinant of these differences is the increasing accumulation of structural deviations that results from sequence divergence.

Highlights

  • A variety of biophysical and evolutionary forces affect the process of amino acid substitution in protein sequences

  • Our analyses show that computational predictions have reproducibilities similar to those observed in experimental measurements of replicate preference profiles, and can largely recapitulate the site-specific amino acid preferences (SSAP) reported in a mutagenesis experiment

  • Our analyses suggest that thermodynamic stability can substantially contribute to the SSAP of proteins, and that the computational procedure implemented here can recapitulate this contribution

Introduction

A variety of biophysical and evolutionary forces affect the process of amino acid substitution in protein sequences. Models describing the tempo and mode of amino acid substitutions are the core machinery for the detection of divergent homologs and the construction of accurate phylogenetic trees (Yang 2014). The simplest of these models assume that each site evolves independently of the others, and that transition rates between different amino acids at a given site are proportional to the overall amino acid abundance in proteins (Dayhoff et al. 1978).
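
As a minimal sketch of the simplest model described above, the Python snippet below builds a site-independent rate matrix in which the rate of change to amino acid j is proportional to its overall abundance. This "equal-input" form is an illustrative assumption and is not the matrix of Dayhoff et al., which was estimated empirically. The function name `equal_input_rate_matrix` and the uniform example frequencies are hypothetical and do not come from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def equal_input_rate_matrix(freqs):
    """Return Q where Q[i][j] is proportional to freqs[j] for i != j.

    Rows sum to zero, and Q is scaled so that the expected rate at
    equilibrium is one substitution per unit time.
    """
    n = len(freqs)
    total = sum(freqs)
    pi = [f / total for f in freqs]  # normalise to a probability vector

    # Off-diagonal rates depend only on the target amino acid's abundance.
    q = [[pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero

    # Expected substitution rate at equilibrium: -sum_i pi_i * Q[i][i].
    mean_rate = -sum(pi[i] * q[i][i] for i in range(n))
    return [[value / mean_rate for value in row] for row in q]


# Illustrative input only: uniform abundances, not measured frequencies.
uniform = [1.0 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
Q = equal_input_rate_matrix(uniform)
```

Because every site shares the same matrix in this sketch, it cannot represent site-specific preferences. Models parameterized with SSAP relax exactly that assumption.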
