Amino acid propensities for protein secondary structures are vital for protein structure prediction, for understanding folding, and for protein design, and have been studied using various theoretical and experimental methods. Traditional statistical assessments of average propensities have been carried out on relatively small datasets and for only a few secondary structures. They also average out environmental factors and offer no insight into how consistent the preferences are across diverse protein structures. While a few studies have explored variations in propensities across protein structural classes and folds, such variations across individual protein structures remain to be explored. In this work, we have revised the average propensities for all six secondary structures, namely α-helix, β-strand, 3₁₀-helix, π-helix, turn and coil, by analyzing the most exhaustive dataset available to date using two robust secondary structure assignment algorithms, DSSP and STRIDE. The propensities evaluated here can serve as a standard reference. Moreover, we present, for the first time, the propensities within individual protein structures and investigate how the preferences of residues and, more interestingly, of residue groups formed on the basis of their structural features vary across unique structures. We devised a novel approach, the minimal set analysis, based on the propensity distribution of residues, which together with the group propensities led us to conclude that a residue's preference for a specific secondary structure is primarily dictated by the structural features of its side chain. The findings of this study provide a more insightful picture of residue propensities and can be useful in protein folding and design studies.
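The abstract does not state the propensity formula used. For orientation, the sketch below assumes the conventional statistical (Chou-Fasman-style) definition: the fraction of a residue type found in a given secondary structure, divided by the overall fraction of residues in that structure. The function name `propensities`, the toy sequence, and the labels `H` (helix) and `C` (coil) are illustrative assumptions, not data or code from the study.

```python
from collections import Counter

def propensities(sequence, states):
    """Return {(residue, state): propensity} for two aligned strings.

    Assumed definition (conventional, not confirmed by the abstract):
        P(res, state) = (n_res_state / n_res) / (n_state / n_total)
    Values above 1 mean the residue is enriched in that state.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    total = len(sequence)
    residue_counts = Counter(sequence)          # n_res
    state_counts = Counter(states)              # n_state
    pair_counts = Counter(zip(sequence, states))  # n_res_state
    result = {}
    for (res, st), n in pair_counts.items():
        frac_in_state = n / residue_counts[res]
        background = state_counts[st] / total
        result[(res, st)] = frac_in_state / background
    return result

# Toy data (made up): one-letter residues and per-residue state labels.
seq = "AAAAGGGG"
ss = "HHHCCCCH"
p = propensities(seq, ss)
```

Applied per protein rather than to a pooled dataset, the same calculation would yield a distribution of propensities across structures, which is the kind of quantity the per-structure analysis described above examines.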