Abstract

The idea that amino acid properties shape the way protein molecules evolve is natural, and it is reasonably well supported both by the structure of the genetic code and, to a large extent, by experimental measures of amino acid similarity. Nevertheless, a significant gap remains between observed similarity matrices and their reconstructions from amino acid properties. We therefore introduce a simple theoretical model of amino acid similarity matrices that splits each matrix into two parts: one that depends only on the mutabilities of amino acids and another that depends on pairwise similarities between them. New synthetic amino acid properties are then derived from the pairwise similarities and used to reconstruct similarity matrices covering a wide range of information entropies. Our model explains up to 94% of the variability in the BLOSUM family of amino acid similarity matrices in terms of amino acid properties. The new properties derived from amino acid similarity matrices correlate highly with properties known to be important for molecular evolution, such as the hydrophobicity, size, shape and charge of amino acids. This result closes the gap in our understanding of the influence of amino acids on evolution at the molecular level. The method was applied to a single family of similarity matrices that is often used in general sequence homology searches, but it is general and can also be used for more specific matrices. The new synthetic properties can be used in analyses of protein sequences in various biological applications.

Highlights

  • The connection between amino acid properties and molecular evolution was proposed very soon after the discovery of the latter [1,2,3,4]

  • Amino acid similarity matrices (AASMs) are a compact representation of molecular evolution

  • Altschul [26] has shown that every AASM can be represented in the following way: Sik

Summary

Introduction

The connection between amino acid properties and molecular evolution was proposed very soon after the discovery of the latter [1,2,3,4]. Evolving proteins retain their structure and function, so by studying the mutations that occur in natural sequences and observing which properties are conserved in these mutations, one may find which properties are important. Introduced by Dayhoff and Eck in 1968 [13], amino acid similarity matrices (AASMs) were subsequently developed by several researchers. These matrices are used for measuring the similarity of proteins by algorithms such as Smith-Waterman [14] or BLAST [15]. It has been shown that mutation matrices specific to given protein families can exhibit even stronger correlations [19]. These findings are good evidence supporting the role that amino acid properties play in molecular evolution.
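How a similarity matrix of this kind is scored, and what its information entropy measures, can be illustrated with a minimal sketch. It assumes the standard log-odds definition, in which the score of a pair is the logarithm of its observed pair frequency divided by the frequency expected by chance, scaled by a constant. The two-letter alphabet, the frequencies `p_toy` and `q_toy`, and the function names are hypothetical and chosen only for illustration. The sketch is not the model introduced in this paper.

```python
import math

# Half-bit scaling constant (assumption): scores are reported in units of
# 1/2 bit, the scale commonly quoted for BLOSUM62.
HALF_BIT = math.log(2) / 2


def log_odds_matrix(q, p, lam=HALF_BIT):
    """Integer log-odds scores S[i][k] = round(ln(q[i][k] / (p[i] * p[k])) / lam).

    q: symmetric matrix of pair (target) frequencies, summing to 1.
    p: background frequencies of the individual letters.
    """
    n = len(p)
    return [
        [round(math.log(q[i][k] / (p[i] * p[k])) / lam) for k in range(n)]
        for i in range(n)
    ]


def relative_entropy_bits(q, p):
    """Relative entropy H = sum_ik q_ik * log2(q_ik / (p_i * p_k)), in bits."""
    n = len(p)
    return sum(
        q[i][k] * math.log2(q[i][k] / (p[i] * p[k]))
        for i in range(n)
        for k in range(n)
        if q[i][k] > 0
    )


# Hypothetical two-letter alphabet: identical pairs are observed more often
# than chance (0.4 vs 0.25), mismatched pairs less often (0.1 vs 0.25).
p_toy = [0.5, 0.5]
q_toy = [[0.4, 0.1],
         [0.1, 0.4]]

if __name__ == "__main__":
    print(log_odds_matrix(q_toy, p_toy))
    print(relative_entropy_bits(q_toy, p_toy))
```

In this toy case, pairs seen more often than chance receive positive scores and pairs seen less often receive negative ones. The relative entropy summarizes how far the pair frequencies depart from independence, so matrices built from more closely related sequences have higher values.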
