A base measure of precision for protein stability predictors: structural sensitivity

Octav Caldararu, Tom L Blundell, Kasper P Kepp

doi:10.1186/s12859-021-04030-w

Open Access

https://doi.org/10.1186/s12859-021-04030-w

Abstract

Background: Prediction of the change in fold stability (ΔΔG) of a protein upon mutation is of major importance to protein engineering and screening of disease-causing variants. Many prediction methods can use 3D structural information to predict ΔΔG. While the performance of these methods has been extensively studied, a new problem has arisen due to the abundance of crystal structures: How precise are these methods in terms of the structure input used, which structure should be used, and how much does it matter? Thus, there is a need to quantify the structural sensitivity of protein stability prediction methods.

Results: We computed the structural sensitivity of six widely used prediction methods by use of saturated computational mutagenesis on a diverse set of 87 structures of 25 proteins. Our results show that structural sensitivity varies massively and, surprisingly, falls into two very distinct groups: methods that take detailed account of the local environment show a sensitivity of ~0.6 to 0.8 kcal/mol, whereas machine-learning methods display much lower sensitivity (~0.1 kcal/mol). We also observe that the precision correlates with the accuracy for mutation-type-balanced data sets but not with the generally reported accuracy of the methods, indicating the importance of mutation-type balance in both contexts.

Conclusions: The structural sensitivity of stability prediction methods varies greatly and is caused mainly by the models and less by the actual protein structural differences. As a new recommended standard, we therefore suggest that ΔΔG values are evaluated on three protein structures when available and the associated standard deviation reported, to emphasize not just the accuracy but also the precision of the method in a specific study. Our observation that machine-learning methods deemphasize structure may indicate that folded wild-type structures alone, without the folded mutant and unfolded structures, only add modest value for assessing protein stability effects, and that side-chain-sensitive methods overstate the significance of the folded wild-type structure.
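
The recommendation to evaluate ΔΔG on several structures and report the standard deviation can be illustrated with a minimal Python sketch. It assumes that structural sensitivity is summarized as the sample standard deviation of the predicted ΔΔG for the same mutation across structures, averaged over mutations; the exact definition used in the paper is not given in this abstract. The function names, structure labels, mutation labels, and ΔΔG values are hypothetical.

```python
from statistics import mean, stdev


def per_mutation_spread(ddg_by_structure):
    """Sample standard deviation of predicted ddG per mutation across structures.

    ddg_by_structure maps a structure label to {mutation: ddG in kcal/mol}.
    Only mutations predicted for every structure are compared.
    """
    predictions = list(ddg_by_structure.values())
    if len(predictions) < 2:
        raise ValueError("need at least two structures to compute a spread")
    # Keep only mutations that appear in all structures.
    shared = set(predictions[0]).intersection(*predictions[1:])
    return {m: stdev([p[m] for p in predictions]) for m in sorted(shared)}


def structural_sensitivity(ddg_by_structure):
    """Assumed summary: mean of the per-mutation standard deviations."""
    return mean(per_mutation_spread(ddg_by_structure).values())


# Hypothetical predictions for one protein from three structures.
example = {
    "structure_A": {"A10G": 1.0, "L25P": 3.0},
    "structure_B": {"A10G": 1.2, "L25P": 2.0},
    "structure_C": {"A10G": 0.8, "L25P": 4.0},
}

spread = per_mutation_spread(example)
sensitivity = structural_sensitivity(example)
for mutation, sd in spread.items():
    print(f"{mutation}: SD = {sd:.2f} kcal/mol")
print(f"mean SD = {sensitivity:.2f} kcal/mol")
```

Reporting the per-mutation standard deviation alongside each ΔΔG value, rather than only the average, keeps the precision of individual predictions visible.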

Highlights

Prediction of the change in fold stability (ΔΔG) of a protein upon mutation is of major importance to protein engineering and screening of disease-causing variants.
Our observation that machine-learning methods deemphasize structure may indicate that folded wild-type structures alone, without the folded mutant and unfolded structures, only add modest value for assessing protein stability effects, and that side-chain-sensitive methods overstate the significance of the folded wild-type structure.
Structural sensitivity measured for the full proteins: The 25 proteins were subjected to computational saturated mutagenesis, starting from each of the selected structures for each protein.
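
Computational saturated mutagenesis, as mentioned in the last highlight, can be sketched as enumerating every single-residue substitution of a sequence. The Python sketch below assumes the usual meaning, i.e. all 19 alternative standard amino acids at each position; the function name, the mutation label format (wild type, position, mutant), and the example sequence are illustrative choices and are not taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def saturation_mutations(sequence, first_residue_number=1):
    """List all single-point substitutions, e.g. 'M1A', for a protein sequence.

    first_residue_number lets the labels follow the numbering of a structure
    file, which does not always start at 1.
    """
    mutations = []
    for offset, wild_type in enumerate(sequence):
        if wild_type not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue {wild_type!r} at index {offset}")
        position = first_residue_number + offset
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:  # skip the identity "mutation"
                mutations.append(f"{wild_type}{position}{mutant}")
    return mutations


# Hypothetical three-residue fragment: 3 positions x 19 substitutions each.
fragment_mutations = saturation_mutations("MKT")
print(len(fragment_mutations))
print(fragment_mutations[:3])
```

Each resulting mutation would then be passed to a stability predictor once per available structure, giving the per-structure ΔΔG values whose spread defines the structural sensitivity.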

Summary

Introduction

Prediction of the change in fold stability (ΔΔG) of a protein upon mutation is of major importance to protein engineering and screening of disease-causing variants. An important distinction can be made between those methods that only use the protein amino-acid sequence to predict stability and those that use a three-dimensional wild-type structure as input. The worse-than-expected performance of structure-based methods can relate directly to the quality of the structures used. It has long been debated whether crystal structures reproduce the native structures of proteins in solution and cells, as structures could be affected by crystal packing effects [24, 25]. Databases such as ProTherm [26] and VariBench [27] annotate each experimental data point with a Protein Data Bank (PDB) [28] code that may not represent the best structure if more structures are available, and this could affect the computed ΔΔG value.

Journal: BMC Bioinformatics
Publication Date: Feb 25, 2021
Citations: 32
License type: open-access
