Abstract

Background: β-turns are a type of secondary structure that plays an essential role in molecular recognition, protein folding, and stability. They are the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Predicting them is considered one of the crucial problems in bioinformatics and molecular biology, and it can provide valuable insights and inputs for fold recognition and drug design.

Results: We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves a Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when shape strings are considered as additional features.

Conclusions: In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance than competing prediction methods.
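The abstract reports two evaluation measures, Qtotal and MCC. The sketch below shows how these are commonly computed from per-residue binary labels (1 for turn, 0 for non-turn). It assumes the usual definitions, Qtotal as the percentage of correctly classified residues and MCC as the standard Matthews formula; the function names are illustrative and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels, with 1 = turn and 0 = non-turn."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def q_total(tp, tn, fp, fn):
    """Percentage of residues classified correctly (assumed definition of Qtotal)."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because only about a quarter of residues lie on β-turns, a predictor that labels everything as non-turn can still reach a high Qtotal, which is one reason MCC is reported alongside it.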

Highlights

  • Secondary structure of proteins consists of basic elements: α-helices, β-sheets, random coils, and turns. α-helices and β-sheets are considered regular secondary structure elements, while the residues that correspond to turn structures do not form regular secondary structure elements

  • The machine learning methods include BTPRED [13], BetaTpred2 [14], MOLEBRNN [15] and NetTurnP [1], which are based on artificial neural networks (ANNs), Kim’s method based on k-nearest neighbor (KNN) [16], as well as support vector machine (SVM) based methods, which have recently become popular in the field of β-turn prediction

  • From the results we found that the performance of H-SVM-LR using a sliding window on both position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) is far better than using a sliding window on PSSMs only and adding the PSS for the central amino acid


Summary

Introduction

Secondary structure of proteins consists of basic elements: α-helices, β-sheets, random coils, and turns. α-helices and β-sheets are considered regular secondary structure elements, while the residues that correspond to turn structures do not form regular secondary structure elements.

BetaTpred enhances the performance of β-turn prediction by using secondary structure prediction and evolutionary information, in the form of position specific scoring matrices (PSSMs), as input to the neural networks. Kim’s method encodes the protein sequence using a window of up to 9 residues as input to a KNN-based method, which is combined with a filter that uses the secondary structure predicted with PSIPRED for the central residue. In Zheng and Kurgan’s method, an SVM is utilized to predict β-turns using window-based information extracted from four predicted secondary structures (PSSs) together with a selected set of PSSMs as input to the SVM. The SVMs were aggregated using a linear logistic regression model.
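Several of the methods above feed a classifier with a sliding window of per-residue features. The following is a minimal sketch of such an encoding, assuming a PSSM given as one list of scores per residue, a three-state predicted secondary structure string over `H`, `E`, and `C`, and zero padding at the chain ends. The function name, the default window width of 7, and the padding scheme are illustrative assumptions, not details taken from any of the cited methods.

```python
def window_features(pssm, pss, window=7, states="HEC"):
    """Build one feature vector per residue from a window centered on it.

    pssm:   list of per-residue score lists (all the same length).
    pss:    predicted secondary structure string, one state per residue.
    window: odd window width (hypothetical default).
    Positions that fall outside the chain are filled with zeros.
    """
    if window % 2 == 0:
        raise ValueError("window width must be odd")
    if len(pssm) != len(pss):
        raise ValueError("pssm and pss must have the same length")

    half = window // 2
    n_cols = len(pssm[0])
    padding = [0.0] * (n_cols + len(states))

    features = []
    for i in range(len(pssm)):
        row = []
        for j in range(i - half, i + half + 1):
            if 0 <= j < len(pssm):
                row.extend(pssm[j])
                # One-hot encoding of the predicted state at position j.
                row.extend(1.0 if pss[j] == s else 0.0 for s in states)
            else:
                row.extend(padding)
        features.append(row)
    return features
```

Each returned vector has `window * (n_cols + len(states))` entries and could be passed to an SVM or a logistic regression model. Restricting the one-hot part to the central residue would correspond to the alternative encoding that uses a window on PSSMs only.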

