Abstract

Software to predict the change in protein stability upon point mutation is a valuable tool for a number of biotechnological and scientific problems. To facilitate the development of such software and provide easy access to the available experimental data, the ProTherm database was created. Biases in the methods and types of information collected have led to disparity in the types of mutations for which experimental data is available. For example, mutations to alanine are hugely overrepresented whereas those involving charged residues, especially from one charged residue to another, are underrepresented. ProTherm subsets created as benchmark sets that do not account for this often underrepresent certain mutational types. This issue introduces systematic biases into previously published protocols’ ability to accurately predict the change in folding energy on these classes of mutations. To resolve this issue, we have generated a new benchmark set with these problems corrected. We have then used the benchmark set to test a number of improvements to the point mutation energetics tools in the Rosetta software suite.
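The compositional bias described above can be checked with a simple tally of mutation categories. The following is a minimal sketch, not the authors' procedure: the category names, the choice of K, R, D, and E as the charged residues (histidine is left out), the `L45A`-style mutation strings, and the toy data are all assumptions made for illustration.

```python
from collections import Counter

# Residue classes assumed for this illustration only (not the paper's scheme).
POSITIVE = set("KR")
NEGATIVE = set("DE")
CHARGED = POSITIVE | NEGATIVE


def classify_mutation(mutation: str) -> str:
    """Assign a point mutation such as 'L45A' to a coarse category."""
    wild_type, mutant = mutation[0], mutation[-1]
    if wild_type in CHARGED and mutant in CHARGED:
        return "charged_to_charged"
    if mutant == "A":
        return "to_alanine"
    if wild_type in CHARGED or mutant in CHARGED:
        return "involves_charged"
    return "other"


def category_fractions(mutations):
    """Return the fraction of mutations that fall in each category."""
    counts = Counter(classify_mutation(m) for m in mutations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


# Hypothetical toy benchmark set, invented for this example.
example_set = ["L45A", "V12A", "I30A", "K7E", "D101N", "F22L", "E5A", "T9S"]
fractions = category_fractions(example_set)
for category, fraction in sorted(fractions.items()):
    print(f"{category}: {fraction:.3f}")
```

Comparing such fractions between a benchmark subset and the full database would show which categories a subset overrepresents or underrepresents.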

Highlights

  • The ability to accurately predict the stability of a protein upon mutation is important for numerous problems in protein engineering and medicine, including stabilization and activity optimization of biologic drugs.

  • To compare the composition of these benchmark sets to that of the database, we examined the curated ProTherm (ProTherm∗) provided by Ó Conchúir et al (2015), which is a selection of entries containing only mutations that occur on a single chain and provide experimental ΔΔG values (Supplementary Table 2).

  • We describe a number of issues in previous benchmark sets used to assess the quality of protein stability prediction software.


Summary

Introduction

The ability to accurately predict the stability of a protein upon mutation is important for numerous problems in protein engineering and medicine, including stabilization and activity optimization of biologic drugs. To perform this task, a number of strategies and force fields have been developed, including those that operate exclusively on sequence (Casadio et al, 1995; Capriotti et al, 2005; Kumar et al, 2009) as well as those that involve sophisticated force fields, whether knowledge based (Sippl, 1995; Gilis and Rooman, 1996; Potapov et al, 2009), physical (Pitera and Kollman, 2000; Pokala and Handel, 2005; Benedix et al, 2009), or hybrid (Pitera and Kollman, 2000; Guerois et al, 2002; Kellogg et al, 2011; Jia et al, 2015; Park et al, 2016; Quan et al, 2016). At the time of this writing, the ProTherm database contains 26,045 entries.

