Abstract

The folding rate is an important property of a protein, and predicting it is useful for understanding the protein folding process and for guiding protein design. In this study, we developed a support vector machine (SVM) based method that predicts both the folding kinetic type of a protein (two-state or non-two-state) and its real-valued folding rate. The method uses features calculated from the three-dimensional structure: contact order, various properties of the non-local contact clusters, secondary structure information and sequence length. We systematically studied the contribution of each individual feature to folding rate prediction. Using the features that contributed most, we trained the SVM with leave-one-out cross-validation and tested it on a testing dataset. The Pearson correlation coefficient, mean absolute difference and root mean square error between the predicted and experimental folding rates (base-10 logarithmic scale) are 0.814, 0.752 and 0.910 for two-state proteins, and 0.860, 0.687 and 0.876 for non-two-state proteins. Moreover, our method predicts whether a protein of known atomic structure folds with two-state or non-two-state kinetics, and it correctly classifies the folding mechanism of 80% of the proteins in a testing dataset. Finally, we evaluated our method alongside eight existing protein folding rate prediction tools on a non-overlapping benchmark dataset. The resulting prediction performance is also reported and discussed.
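
The three evaluation measures named above (Pearson correlation coefficient, mean absolute difference and root mean square error on log10 folding rates) can be illustrated with a minimal sketch. This is not the authors' code: the function name `evaluate_folding_rate_predictions` is hypothetical, the standard textbook definitions of the three measures are assumed, and the sample values are invented purely for illustration.

```python
import math


def evaluate_folding_rate_predictions(predicted, experimental):
    """Return (pearson_r, mean_abs_diff, rmse) for paired log10 folding rates.

    Hypothetical helper for illustration; assumes the standard definitions
    of the three measures.
    """
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have equal length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two proteins are needed")

    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n

    # Pearson correlation: covariance divided by the product of the spreads.
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_e = sum((e - mean_e) ** 2 for e in experimental)
    if ss_p == 0 or ss_e == 0:
        raise ValueError("correlation is undefined for constant input")
    pearson_r = cov / math.sqrt(ss_p * ss_e)

    # Mean absolute difference between predicted and experimental values.
    mean_abs_diff = sum(abs(p - e) for p, e in zip(predicted, experimental)) / n

    # Root mean square error.
    rmse = math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)

    return pearson_r, mean_abs_diff, rmse


if __name__ == "__main__":
    # Invented log10 folding rates, for illustration only.
    predicted = [2.1, 3.4, 0.8, 4.9]
    experimental = [1.7, 3.9, 1.5, 4.2]
    r, mad, rmse = evaluate_folding_rate_predictions(predicted, experimental)
    print(f"r={r:.3f}  MAD={mad:.3f}  RMSE={rmse:.3f}")
```

Because both inputs are on a base-10 logarithmic scale, a mean absolute difference of 1.0 corresponds to predictions that are off by a factor of ten on average.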
