Abstract
Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with exp(4Nem(1−1/(2N))), where m is Malthusian fitness and Ne and N are effective and actual population sizes. On the other hand, the probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with exp(−ψN), where the evolutionary statistical energy ψN is represented as the sum of one body (h) (compositional) and pairwise (J) (covariational) interactions over all sites and site pairs. A protein folding theory based on the random energy model (REM) indicates that the equilibrium ensemble of natural protein sequences is well represented by a canonical ensemble characterized by exp(−ΔGND/kBTs) or by exp(−GN/kBTs) if an amino acid composition is kept constant, where ΔGND≡GN−GD,GN and GD are the native and denatured free energies, and Ts is the effective temperature representing the strength of selection pressure. Thus, 4Nem(1−1/(2N)),−ΔψND(≡−ψN+ψD), and −ΔGND/kBTs must be equivalent to each other. With h and J estimated by the DCA program, the changes (ΔψN) of ψN due to single nucleotide nonsynonymous substitutions are analyzed. The results indicate that the standard deviation of ΔGN(=kBTsΔψN) is approximately constant irrespective of protein families, and therefore can be used to estimate the relative value of Ts. Glass transition temperature Tg and ΔGND are estimated from estimated Ts and experimental melting temperature (Tm) for 14 protein domains. The estimates of ΔGND agree with their experimental values for 5 proteins, and those of Ts and Tg are all within a reasonable range. In addition, approximating the probability density function (PDF) of ΔψN by a log-normal distribution, PDFs of ΔψN and Ka/Ks, which is the ratio of nonsynonymous to synonymous substitution rate per site, in all and in fixed mutants are estimated. The equilibrium values of ψN, at which the average of Δψ in fixed mutants is equal to zero, well match ψN averaged over homologous sequences, confirming that the present methods for a fixation process of mutations and for the equilibrium ensemble of ψN give a consistent result with each other. The PDFs of Ka/Ks at equilibrium confirm that Ts negatively correlates with the amino acid substitution rate (the mean of Ka/Ks) of protein. Interestingly, stabilizing mutations are significantly fixed by positive selection, and balance with destabilizing mutations fixed by random drift, although most of them are removed from population. Supporting the nearly neutral theory, neutral selection is not significant even in fixed mutants.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.