We have constructed a bank (FTTP) of tendentious factors for the three states of three-peptide units from the PDB database, based on a conformational dihedral angle library, and demonstrated that amino acid biases toward protein secondary structure are present in natural protein sequences. Our results reveal that the 20 standard amino acids fall into three groups. Nine residues are inclined to α-helix and share a common character (e.g., aliphatic residues with a direct side chain, or positively/negatively charged residues); they are arranged in three successive grades: EA, QKRLD, and MN. Seven residues are apt to β-strand, being aliphatic residues with a 2′-branched side chain or benzyl-containing residues; they fall into three ranks: PV, IYTC, and F. Four residues, SHWG, show a double tendency toward both α and β. Notably, proline has the strongest ability to form an extended conformation, with an R_e value of up to 9.5298 at position 3 (Table 3). Thus, biases of codons show an evident tendency in protein folding: GC-rich codons are mainly responsible for forming contracted conformation, and the codon's first letter plays a dominant role in translating the genomic GC signature into protein sequences and structures. Biases of amino acids will therefore play an important role in protein folding, folding codons, domain refinement, structure prediction, and structural genomics/proteomics.
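The three-group classification stated above can be written down as a small lookup table. The following Python sketch is only an illustration of that grouping, not the FTTP bank itself: the names `ALPHA_GRADES`, `BETA_RANKS`, `DUAL`, `TENDENCY`, and `classify` are hypothetical, and the numeric grade labels 1 to 3 are an assumption that simply follows the order in which the grades are listed. No tendency factor values are included, since they are not given here.

```python
# Residue groups exactly as listed in the abstract (one-letter codes).
# Grade/rank numbers 1..3 are an illustrative assumption following listed order.
ALPHA_GRADES = ["EA", "QKRLD", "MN"]   # residues inclined to alpha-helix
BETA_RANKS = ["PV", "IYTC", "F"]       # residues apt to beta-strand
DUAL = "SHWG"                          # residues with a double tendency


def build_table():
    """Map each residue to a (tendency, grade) pair; grade is None for dual."""
    table = {}
    for grade, residues in enumerate(ALPHA_GRADES, start=1):
        for residue in residues:
            table[residue] = ("alpha", grade)
    for rank, residues in enumerate(BETA_RANKS, start=1):
        for residue in residues:
            table[residue] = ("beta", rank)
    for residue in DUAL:
        table[residue] = ("dual", None)
    return table


TENDENCY = build_table()


def classify(sequence):
    """Return the (tendency, grade) label for every residue in a sequence."""
    return [TENDENCY[residue] for residue in sequence.upper()]


if __name__ == "__main__":
    # Label a short hypothetical peptide, one residue per line.
    for residue, label in zip("AEPVG", classify("AEPVG")):
        print(residue, label)
```

A per-residue lookup of this kind ignores position within the three-peptide unit, which the position-specific value quoted for proline suggests does matter; it is meant only to make the grouping concrete.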