Abstract

Background

Quantitative descriptions of amino acid similarity, expressed as probabilistic models of evolutionary interchangeability, are central to many mainstream bioinformatic procedures such as sequence alignment, homology searching, and protein structural prediction. Here we present a web-based, user-friendly analysis tool that allows any researcher to quickly and easily visualize relationships between these bioinformatic metrics and to explore their relationships to underlying indices of amino acid molecular descriptors.

Results

We demonstrate the three fundamental types of question that our software can address by taking as a specific example the connections between 49 measures of amino acid biophysical properties (e.g., size, charge and hydrophobicity), a generalized model of amino acid substitution (as represented by the PAM74-100 matrix), and the mutational distance that separates amino acids within the standard genetic code (i.e., the number of point mutations required for interconversion during protein evolution). We show that our software allows a user to recapture the insights from several key publications on these topics in just a few minutes.

Conclusion

Our software facilitates rapid, interactive exploration of three interconnected topics: (i) the multidimensional molecular descriptors of the twenty proteinaceous amino acids, (ii) the correlation of these biophysical measurements with observed patterns of amino acid substitution, and (iii) the causal basis for differences between any two observed patterns of amino acid substitution. This software acts as an intuitive bioinformatic exploration tool that can guide more comprehensive statistical analyses relating to a diverse array of specific research questions.

Highlights

  • It would be trivial to find an equivalent set of example analyses that focused on protein folding or homology searching

  • Our visualization software can be used to investigate any area of bioinformatics that builds on understanding how amino acids' molecular descriptors influence the patterns by which amino acids substitute for one another during evolution

Summary

Results

We present three simple example analyses to illustrate the types of exploration that our software allows. Our quick analysis indicates that physiochemical considerations are more important to long-term protein evolution than can be explained by codon assignments: the physiochemical properties are more strongly correlated with observed substitution patterns than with mutational distance within the genetic code, i.e., physiochemical similarity comes to dominate patterns of substitution as evolution proceeds. This same feature of the AAIndex Explorer tool could be used to quickly visualize which properties (and which amino acids) are responsible for the difference between any two substitution matrices (e.g., between a "generalized" or global model of amino acid substitution, as found in a PAM or BLOSUM matrix, and any observed pattern of interchange within a specific protein family or phyletic lineage). The minimum spanning tree of size, charge, and hydrophobicity (Figure 2) is recolored to indicate the similarity of each amino acid index to the PAM74-100 amino acid substitution matrix [5].
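The comparison described here rests on two quantities that are easy to compute by hand: the mutational distance between two amino acids in the standard genetic code, and the correlation of that distance with a difference in some molecular descriptor. The following is a minimal Python sketch of that idea, not the AAIndex Explorer implementation. The function names (`mutational_distance`, `pearson`) are hypothetical, and the Kyte-Doolittle hydropathy scale is used only as an illustrative descriptor; it is an assumption here and is not claimed to be one of the 49 indices in the paper.

```python
from itertools import combinations, product
from math import sqrt

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to its list of codons, skipping stop codons ("*").
CODONS = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO):
    if aa != "*":
        CODONS.setdefault(aa, []).append("".join(bases))


def mutational_distance(a, b):
    """Minimum number of point mutations needed to turn a codon of `a` into a codon of `b`."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[a]
        for c2 in CODONS[b]
    )


# Illustrative descriptor (assumption): Kyte-Doolittle hydropathy values.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# All unordered pairs of the twenty amino acids.
pairs = list(combinations(sorted(CODONS), 2))
distances = [mutational_distance(a, b) for a, b in pairs]
differences = [abs(HYDROPATHY[a] - HYDROPATHY[b]) for a, b in pairs]

r = pearson(distances, differences)
print(f"pairs: {len(pairs)}, r(distance, |hydropathy difference|) = {r:.3f}")
```

Replacing `distances` with the off-diagonal scores of a substitution matrix would give the other side of the comparison, so the two correlations can be read against each other. Because amino acid pairs share members and are not independent observations, any such coefficient is best treated as exploratory.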
