Abstract

Background: Protein destabilization is a common mechanism by which amino acid substitutions cause human diseases. Although several machine learning methods have been reported for predicting protein stability changes upon amino acid substitutions, the previous studies did not utilize relevant sequence features representing biological knowledge for classifier construction.

Results: In this study, a new machine learning method has been developed for sequence feature-based prediction of protein stability changes upon amino acid substitutions. Support vector machines were trained with data from experimental studies on the free energy change of protein stability upon mutations. To construct accurate classifiers, twenty sequence features were examined for input vector encoding. It was shown that classifier performance varied significantly by using different sequence features. The most accurate classifier in this study was constructed using a combination of six sequence features. This classifier achieved an overall accuracy of 84.59% with 70.29% sensitivity and 90.98% specificity.

Conclusions: Relevant sequence features can be used to accurately predict protein stability changes upon amino acid substitutions. Predictive results at this level of accuracy may provide useful information to distinguish between deleterious and tolerant alterations in disease candidate genes. To make the classifier accessible to the genetics research community, we have developed a new web server, called MuStab (http://bioinfo.ggc.org/mustab/).
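The accuracy, sensitivity, and specificity quoted above are presumably the standard confusion-matrix measures for a binary classifier. The minimal sketch below shows how such figures are computed. The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study; which class (stabilizing or destabilizing substitutions) is treated as "positive" is likewise an assumption, since the abstract does not say.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive examples predicted correctly / incorrectly
    tn/fp: negative examples predicted correctly / incorrectly
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        # fraction of all examples classified correctly
        "accuracy": (tp + tn) / (positives + negatives),
        # fraction of positive examples recovered (true positive rate)
        "sensitivity": tp / positives,
        # fraction of negative examples recovered (true negative rate)
        "specificity": tn / negatives,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = classification_metrics(tp=70, fn=30, tn=180, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

When all three measures are computed on the same set of examples, overall accuracy is an average of sensitivity and specificity weighted by class size, so with unequal classes it falls between the two and closer to the rate of the larger class.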

Highlights

  • Protein destabilization is a common mechanism by which amino acid substitutions cause human diseases

  • Our results indicate that accurate support vector machine (SVM) models can be constructed by using relevant sequence features for input vector encoding

  • The novelty of our method lies in the use of sequence features representing biological knowledge for input encoding

Introduction

Protein destabilization is a common mechanism by which amino acid substitutions cause human diseases. Amino acid substitutions can cause a series of changes to normal protein function, such as geometric constraint changes, physico-chemical effects, and disruption of salt bridges or hydrogen bonds [1]. These changes may lead to protein destabilization or some abnormal biological functions. Previous studies suggest that each person may have 24,000 – 40,000 non-synonymous single nucleotide polymorphisms (nsSNPs), and that there are a total of 67,000 – 200,000 common nsSNPs in the human population [2]. These nsSNPs give rise to amino acid substitutions in proteins.
