Abstract

Protein flexibility is relevant to both the structural and functional aspects of proteins. We have analyzed the local primary sequence features that, in combination, can predict the B-value of amino acid residues directly from the protein sequence. We have also analyzed the distribution of B-values in different regions of protein three-dimensional structures. On average, the normalized B-value decreases by 0.1055 with every 0.5 Å increase in the distance of the residue from the protein surface. Residues in loop regions have higher B-values than residues in regular secondary structural elements. Buried residues in the protein core are more rigid (lower B-values) than residues on the protein surface. Similarly, hydrophobic residues, which tend to be found in the protein core, have a lower average B-value than polar residues. Finally, we propose a method based on Support Vector Regression (SVR) to predict the B-value from the protein primary sequence. Our results show that the SVR model achieves a correlation coefficient of 0.47, which is comparable to existing methods.

Highlights

  • Protein structures are dynamic molecules which are in constant motion

  • Theoretically, protein flexibility is studied with computational models of structure dynamics, atomic normal mode analysis (NMA) and simulations of molecular motion, while experimentally it is probed by techniques such as nuclear magnetic resonance relaxation times, incoherent neutron scattering and X-ray structure B-values (temperature factors) [1,2,3,4,5]

  • The B-value reported in experimental atomic-resolution structures represents the decrease in diffraction intensity due to the dynamic disorder caused by the temperature-dependent vibration of the atoms and the static disorder, which is related to the orientation of the molecule [6]


Summary

Background

Protein structures are dynamic molecules which are in constant motion. Protein motion, or flexibility, is highly correlated with various biological processes such as molecular recognition and catalytic activity.

Protein structures containing B-value data were taken from the Protein Data Bank (PDB) [19], located at the Research Collaboratory for Structural Bioinformatics (RCSB), and were restricted to structures meeting a resolution cutoff.

The CASTp (Computed Atlas of Surface Topography of proteins) server located at http://sts.bioengr.uic.edu/castp/ was used to identify clefts and cavities for each protein structure [22]. The default probe radius of 1.4 Å was used for our calculations.

Altogether, thirty-three sequence features were used as attributes for implementing the SVR model. Each residue is represented by a vector of length 36: 21 for amino acid type, a set of biochemical features, and 10 for amino acid class based on the similarity of the environment of each amino acid residue in protein structures [30].

In 5-fold cross-validation, the training dataset was split into 5 subsets, where one of the subsets was used as the test set while the other 4 subsets were used for training the SVR model.
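The 5-fold protocol described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`k_fold_indices`, `train_test_splits`, `pearson`) are hypothetical, the summary does not say whether the split was made per protein or per residue, and it does not say which correlation coefficient was reported, so Pearson's is assumed here.

```python
import math
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k=5, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_items, k, seed)
    for i, test in enumerate(folds):
        # Training set = every index that is not in the held-out fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def pearson(xs, ys):
    """Pearson correlation between predicted and observed values (assumed metric)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In use, a regression model would be fitted on each `train` list, asked to predict B-values for the matching `test` list, and scored with `pearson`; averaging the five scores gives a single cross-validated figure.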
