PSSP-RFE: Accurate Prediction of Protein Structural Class by Recursive Feature Extraction from PSI-BLAST Profile, Physical-Chemical Property and Functional Annotations

Liqi Li, Yue Zhou, Hua Yang, Xiaoqi Zheng, Yuan Zhang, Zhong Luo, Xiang Cui, Sanjiu Yu

doi:10.1371/journal.pone.0092863

Abstract

Protein structure prediction is critical to the functional annotation of the massive volume of accumulated biological sequences, which creates a pressing need for high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction has become an increasingly challenging task. For most homology-based approaches, the accuracy of protein structural class prediction is sufficiently high on high-similarity datasets but still far from satisfactory on low-similarity datasets, i.e., those with pairwise sequence similarity below 40%. We therefore present a novel method for accurate and reliable protein structural class prediction on both high- and low-similarity datasets. The method is based on a Support Vector Machine (SVM) combined with integrated features from the position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is used to rank the integrated features by recursively removing the feature with the lowest ranking score. The top features selected by SVM-RFE are then input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low-similarity datasets.
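The recursive elimination step described in the abstract can be illustrated with a minimal sketch. It assumes a linear SVM and the common squared-weight ranking criterion; the paper's exact criterion, kernel and multi-class handling are not given on this page, so those choices are assumptions. The function names (`train_linear_svm`, `svm_rfe`), the hyperparameters and the toy data are all hypothetical, and the sketch handles two classes only, whereas the paper predicts four.

```python
def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Fit a linear SVM (w, b) by batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1.
    The hyperparameters are illustrative, not taken from the paper."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Only samples inside the margin contribute to the hinge subgradient.
        violators = [i for i in range(n)
                     if y[i] * (sum(w[j] * X[i][j] for j in range(d)) + b) < 1]
        for j in range(d):
            hinge = sum(y[i] * X[i][j] for i in violators) / n
            w[j] -= lr * (lam * w[j] - hinge)
        b += lr * sum(y[i] for i in violators) / n
    return w, b


def svm_rfe(X, y):
    """Return feature indices ordered from most to least important.

    Each round retrains the SVM on the surviving features and removes the
    one with the smallest squared weight (assumed ranking score)."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    # The last feature removed is the most important one.
    return eliminated[::-1]


# Toy data (hypothetical): column 0 carries a strong class signal, column 1 a
# weaker copy of it, and column 2 is uncorrelated with the labels.
y = [1, 1, 1, 1, -1, -1, -1, -1]
noise = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
X = [[2.0 * yi, 0.5 * yi, ni] for yi, ni in zip(y, noise)]

ranking = svm_rfe(X, y)
top_features = ranking[:2]  # keep the two best-ranked features
print(ranking, top_features)
```

In practice the feature columns would be the integrated PSSM, PROFEAT and GO descriptors, scaled to comparable ranges before ranking, since the squared-weight criterion is sensitive to feature scale.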

Highlights

As the basic building blocks of life, proteins play a central role in most cellular functions such as gene regulation, metabolism and cell proliferation.
We propose a novel computational method that combines a Support Vector Machine (SVM) with PSI-BLAST profiles, physical-chemical properties and functional annotations to further improve the prediction of protein structural class.
Parameter selection: in this study, we used a grid search strategy to select the parameters in LIBSVM, which depend on the dimension Dim of the top feature vector of proteins.
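A grid search of this kind can be sketched as follows, scoring each parameter pair by jackknife (leave-one-out) accuracy, the validation scheme named in the abstract. This page does not state which LIBSVM parameters or grid were searched, so the RBF kernel, the pair (C, gamma) and the power-of-two grid are assumptions. To stay self-contained the sketch does not call LIBSVM: it swaps in a small bias-free kernel SVM trained by dual coordinate ascent. All function names and the toy data are hypothetical.

```python
import math
from itertools import product


def rbf(a, b, gamma):
    """RBF kernel value exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def fit_predict(train_X, train_y, query, C, gamma, sweeps=50):
    """Train a bias-free RBF-kernel SVM by dual coordinate ascent
    (a stand-in for LIBSVM) and return the predicted label of `query`."""
    n = len(train_X)
    K = [[rbf(train_X[i], train_X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            f_i = sum(alpha[j] * train_y[j] * K[i][j] for j in range(n))
            # K[i][i] == 1 for the RBF kernel, so the Newton step on alpha[i]
            # is the margin gap; the result is clipped to the box [0, C].
            alpha[i] = min(C, max(0.0, alpha[i] + (1.0 - train_y[i] * f_i)))
    score = sum(alpha[j] * train_y[j] * rbf(train_X[j], query, gamma)
                for j in range(n))
    return 1 if score >= 0 else -1


def jackknife_accuracy(X, y, C, gamma):
    """Leave each sample out once, train on the rest, and test on it."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += fit_predict(train_X, train_y, X[i], C, gamma) == y[i]
    return hits / len(X)


def grid_search(X, y, c_grid, gamma_grid):
    """Return (accuracy, C, gamma) for the best-scoring grid point."""
    scored = ((jackknife_accuracy(X, y, C, g), C, g)
              for C, g in product(c_grid, gamma_grid))
    return max(scored, key=lambda t: t[0])


# Toy two-class data (hypothetical): two well-separated clusters.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

c_grid = [2.0 ** e for e in (-1, 1, 3)]       # assumed grid, not from the paper
gamma_grid = [2.0 ** e for e in (-3, -1, 1)]  # assumed grid, not from the paper

best_acc, best_C, best_gamma = grid_search(X, y, c_grid, gamma_grid)
print(best_acc, best_C, best_gamma)
```

Because the selected parameters depend on Dim, a search like this would be rerun for each candidate number of top-ranked features rather than once for the full feature vector.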

Summary

Introduction

As the basic building blocks of life, proteins play a central role in most cellular functions such as gene regulation, metabolism and cell proliferation. To interpret the function of a new protein sequence, it is fundamental to understand its 3D structure. Since knowledge of a protein's structural class provides useful information towards determining its 3D structure, prediction of protein structural class from sequence data has become a hot topic in computational biology, especially with the development of high-throughput technologies [1]. Proteins have irregular surfaces and complex 3D structures, but at the secondary structure level they form regular regional fold patterns. Based on their secondary structure content, known protein structures are classified into four categories: all-α, all-β, α/β and α+β. All-α and all-β proteins consist only of α-helices and β-strands, respectively. Experimental approaches to determining the structure of a protein, including X-ray diffraction and nuclear magnetic resonance, are costly and time-consuming, and cannot fully meet researchers’ demands. High-throughput computational approaches have therefore been brought to the forefront of this issue.

Journal: PLoS ONE	Publication Date: Mar 27, 2014
Citations: 60	License type: CC BY 4.0