In protein folding the dynamics is heterogeneous, and folding nuclei play a crucial role. Such folding nuclei are identified, for example, by Φ-value analysis in protein engineering experiments. After recent intensive discussion it has been concluded that the amino-acid sequence at the folding nucleus is not evolutionarily conserved. In other words, the amino acid at the folding nucleus varies even within a group of proteins that have almost the same structure and are expected to have the same folding nucleus. In this paper we therefore look for an evolutionarily conserved quantity, other than the sequence, that specifies folding nuclei.

As a case study we focus on a small protein (TNfn3, PDB code: 1ten), since its folding nuclei have been studied exhaustively. The amino-acid sequence and the native-state structure are taken from the Protein Data Bank (PDB). Our mini-protein, 1ten, consists of 7 β-sheets, and there are 6 folding nuclei on the sheets, listed as (amino-acid type; residue number in this paper, residue number in PDB): (ILE; 20, 821), (TYR; 36, 837), (ILE; 48, 849), (LEU; 50, 851), (ILE; 59, 860), and (VAL; 70, 871). The residue number is the ordinal number of the amino acid along the sequence.

Since folding nuclei are not easily determined from one-dimensional (1D) sequence information, we consider three-dimensional (3D) information from the structure. A similar situation has been encountered in the study of the helix-turn-helix motif, where it has been shown that the 3D keynote, which consists of the list of the numbers of interactions between pairs of amino acids, can characterize the structure of the motif. An interaction is assigned to each pair of atoms, excluding hydrogen, if the distance between the two atoms is smaller than 6 Å. Such a list reflects 3D information about the structure. We have constructed the 3D keynote for 1ten, but it has not worked well for specifying the folding nuclei.
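The contact counting that underlies a 3D keynote can be illustrated with a minimal sketch. It is not the implementation used in the study: the input format (a dictionary from residue number to a list of heavy-atom coordinates) and the function name `contact_counts` are assumptions made here for illustration, and only the 6 Å cutoff is taken from the text.

```python
from itertools import combinations
from math import dist

CUTOFF = 6.0  # angstroms; the threshold quoted in the text


def contact_counts(residues, cutoff=CUTOFF):
    """Count heavy-atom pairs closer than `cutoff` for every residue pair.

    `residues` maps a residue number to a list of (x, y, z) coordinates of
    its non-hydrogen atoms (hypothetical input format, not from the paper).
    Returns a dict {(i, j): number_of_atom_pairs} with i < j; residue pairs
    without any contact are omitted.
    """
    counts = {}
    for (i, atoms_i), (j, atoms_j) in combinations(sorted(residues.items()), 2):
        n = sum(1 for a in atoms_i for b in atoms_j if dist(a, b) < cutoff)
        if n:
            counts[(i, j)] = n
    return counts


# Toy coordinates, invented purely to exercise the function.
toy = {
    1: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    2: [(4.0, 0.0, 0.0)],
    3: [(20.0, 0.0, 0.0)],
}
print(contact_counts(toy))
```

In the toy input, both atoms of residue 1 lie within 6 Å of the single atom of residue 2, while residue 3 is far from everything, so only the pair (1, 2) receives a nonzero count.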
In what follows we therefore modify the 3D keynote so that it is suited to describing protein folding. It has long been recognized that the formation of a hydrophobic core plays an essential role in protein folding in general. However, this entropic effect is not taken into account in the 3D-keynote analysis. We thus introduce a 3D hydrophobicity in order to examine the hydrophobic effect. The 3D hydrophobicity of an amino acid is defined as the sum of the hydrophobicity values of the amino acids within 12 Å of it. This 3D hydrophobicity reflects the structure of the native state. The structure is determined by the interactions among amino acids, including the entropic effect, and the folding nuclei are related to the formation of the hydrophobic core.

In Fig. 1(a) the 3D-hydrophobicity profile (3D-HP) is shown as a function of position along the sequence. The 3D hydrophobicity takes large values at the folding nuclei, so it can specify them. To examine the evolutionary properties of the 3D-HP, we compare several proteins whose structures are similar to that of 1ten. The structural similarity is analyzed with the HSSP server. In Fig. 1(b) we plot the profiles of the proteins with the highest Z-scores in HSSP relative to 1ten. From this plot it can be concluded that the 3D-HP is common among these proteins and evolutionarily conserved.

In our study we have focused on the folding nuclei on secondary structures, the β-sheets, consistent with the Φ-value analysis. Here we have assumed a hierarchical picture of folding in which the weak residual interactions determine the 3D arrangement of the secondary structures, while the stronger interactions, for example hydrogen bonding, form the secondary structures before the 3D structure is formed. Having shown that the 3D-HP is evolutionarily conserved, we next examine the correlation between the 3D-HP and the Φ-value at the folding nuclei. An amino acid with a large Φ-value acts as a folding nucleus.
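The definition of the 3D hydrophobicity given above can be sketched in a few lines. Several choices below are assumptions not stated in the text: each residue is represented by a single point (for instance its Cα atom), the residue itself is excluded from its own sum, and the Kyte-Doolittle scale is used as the hydrophobicity value. The function name `hydrophobicity_3d` is likewise hypothetical; only the 12 Å radius comes from the text.

```python
from math import dist

# Kyte-Doolittle hydropathy scale. Using this particular scale is an
# assumption; the text does not say which hydrophobicity values were used.
KD = {
    "ILE": 4.5, "VAL": 4.2, "LEU": 3.8, "PHE": 2.8, "CYS": 2.5,
    "MET": 1.9, "ALA": 1.8, "GLY": -0.4, "THR": -0.7, "SER": -0.8,
    "TRP": -0.9, "TYR": -1.3, "PRO": -1.6, "HIS": -3.2, "GLU": -3.5,
    "GLN": -3.5, "ASP": -3.5, "ASN": -3.5, "LYS": -3.9, "ARG": -4.5,
}


def hydrophobicity_3d(residues, radius=12.0, scale=KD):
    """Return the 3D-hydrophobicity profile along the sequence.

    `residues` is a list of (three_letter_name, (x, y, z)) tuples in
    sequence order, one representative point per residue (assumed here to
    be the C-alpha position). For each residue, the hydrophobicity values
    of all *other* residues closer than `radius` angstroms are summed.
    """
    profile = []
    for i, (_, pos_i) in enumerate(residues):
        total = sum(
            scale[name]
            for j, (name, pos_j) in enumerate(residues)
            if j != i and dist(pos_i, pos_j) < radius
        )
        profile.append(total)
    return profile


# Toy chain on a line, invented purely to exercise the function.
toy_chain = [
    ("ILE", (0.0, 0.0, 0.0)),
    ("VAL", (5.0, 0.0, 0.0)),
    ("LYS", (10.0, 0.0, 0.0)),
    ("LEU", (30.0, 0.0, 0.0)),
]
for (name, _), value in zip(toy_chain, hydrophobicity_3d(toy_chain)):
    print(name, round(value, 2))
```

Applied to real coordinates, plotting the returned list against the residue number would give a profile of the kind described for Fig. 1(a), up to the assumptions listed above.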
We see a tendency for the larger Φ-values to be observed in the middle part of the amino-acid sequence. Listed as (residue number; Φ-value, 3D hydrophobicity), the folding nuclei are (20; 0.38, 23.77), (36; 0.56, 18.27), (48; 0.67, 14.92), (50; 0.42, 12.66), (59; 0.62, 16.71), and (70; 0.54, 27.37). Thus we consider the buriedness of amino acids in the native state, measured by the contact distance. The contact distance is the difference in residue