Abstract

Protein structural class can provide important clues for understanding protein fold, evolution, and function. However, accurately predicting the structural class of low-similarity sequences remains a challenging problem. This paper develops a method for predicting protein structural classes for low-similarity sequences. Using an objective and strict benchmark dataset, we first represented protein samples by optimal tripeptide compositions (OTC), which were selected with a feature selection technique, and achieved an overall accuracy of 91.1% in jackknife cross-validation. We then evaluated three popular features for comparison: the position-specific scoring matrix (PSSM), predicted secondary structure information (PSSI), and the average chemical shift (ACS). Finally, to further improve prediction performance, we examined all combinations of the four kinds of features and achieved a maximum accuracy of 96.7% in jackknife cross-validation by combining OTC with ACS, demonstrating that the model is effective and powerful. Our study provides guidance for extracting valuable information from protein sequences.
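
As an illustration of the tripeptide composition that OTC is selected from, the following is a minimal sketch, not the authors' implementation. It assumes the common definition in which each of the 8000 possible tripeptides over the 20 standard amino acids is counted across overlapping windows and normalized by the number of valid windows. The function name `tripeptide_composition` and the handling of non-standard residues (windows containing them are skipped) are assumptions for illustration; the feature selection step that reduces the 8000 features to the optimal subset is not shown, because the abstract does not specify it.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20^3 = 8000 possible tripeptides, in a fixed order.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return the 8000-dimensional tripeptide frequency vector of a sequence.

    Assumption: windows containing a non-standard residue (e.g. 'X')
    are skipped, and frequencies are normalized by the number of
    valid windows.
    """
    seq = sequence.upper()
    counts = [0] * len(TRIPEPTIDES)
    valid_windows = 0
    # Slide a window of length 3 over the sequence, one residue at a time.
    for i in range(len(seq) - 2):
        idx = INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            valid_windows += 1
    if valid_windows == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [c / valid_windows for c in counts]
```

For example, `tripeptide_composition("ACDA")` has two overlapping windows, `ACD` and `CDA`, so each of those two features receives a frequency of 0.5 and all others are zero. A feature selection method would then rank these 8000 features and keep only the most discriminative ones.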
