Abstract

The classification of proteinogenic amino acids is crucial for understanding their commonalities and differences, and it may hint at why life settled on precisely those amino acids. It is also crucial for predicting electrostatic, hydrophobic, stacking and other interactions, for assessing conservation in multiple alignments, and for many other applications. While several methods have been proposed to find “the” optimal classification, they have shortcomings such as a lack of efficiency and interpretability or an unnecessarily high number of discriminating features. In this study, we propose a novel method based on repeated binary separation using a minimum of five features (such as hydrophobicity or volume), each expressed as numerical values of an amino acid characteristic. The features are extracted from the AAindex database. By simple separation at the medians, we derive the five properties volume, electron–ion-interaction potential, hydrophobicity, α-helix propensity, and π-helix propensity. We extend our analysis to separations other than at the median and further score our combinations based on how natural the separations are.
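The median-based binary separation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`median_split`, `binary_codes`, `separates_all`) are hypothetical, and the toy values are placeholders rather than AAindex entries. The idea is that each feature contributes one bit per amino acid (above the median or not), and a set of features is sufficient when every amino acid receives a unique bit pattern. Five is the smallest possible number of binary features for 20 amino acids, since 2^4 = 16 < 20 ≤ 32 = 2^5.

```python
from statistics import median


def median_split(values):
    """Return 1 for items whose value lies above the median, else 0."""
    m = median(values.values())
    return {item: int(v > m) for item, v in values.items()}


def binary_codes(features):
    """Combine one median-split bit per feature into a code per item."""
    splits = [median_split(f) for f in features]
    return {item: tuple(s[item] for s in splits) for item in features[0]}


def separates_all(codes):
    """True if no two items share the same bit pattern."""
    return len(set(codes.values())) == len(codes)


# Toy placeholder data (NOT real AAindex values): 4 items, 2 features.
feature_1 = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
feature_2 = {"A": 1.0, "B": 4.0, "C": 2.0, "D": 3.0}

codes = binary_codes([feature_1, feature_2])
print(codes, separates_all(codes))
```

In this toy case two features separate four items completely. With real data, one would pass five dictionaries keyed by the 20 one-letter amino acid codes and check `separates_all` for each candidate combination of features.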

Highlights

  • The classification of proteinogenic amino acids is crucial for understanding their commonalities as well as their differences to provide a hint for why life settled on the usage of precisely those amino acids

  • We focus on the classification of the 20 proteinogenic amino acids, the building blocks of proteins

  • This helps us understand why some amino acids show a preference for certain secondary structures of proteins[1], develop electrostatic, hydrophobic, stacking and other interactions, or are more frequently exchanged for one another in a coding sequence, which is the principle underlying substitution matrices[2,3]


Summary

Introduction

The classification of proteinogenic amino acids is crucial for understanding their commonalities and differences, and it may hint at why life settled on precisely those amino acids. It is also crucial for predicting electrostatic, hydrophobic, stacking and other interactions, for assessing conservation in multiple alignments, and for many other applications. Ramsay Taylor[5] introduced a scheme of eight physicochemical measures to classify the 20 amino acids and organized it into an Euler diagram based on the work of Dickerson and Geis[6] (Fig. 1). His aim was an improved description of amino acid relatedness in protein sequence alignments.
