Abstract

Computational protein design has the ambitious goal of crafting novel proteins that address challenges in biology and medicine. To overcome these challenges, the computational protein modeling suite Rosetta has been tailored to address various protein design tasks. Recently, statistical methods have been developed that identify correlated mutations between residues in a multiple sequence alignment of homologous proteins. These subtle inter-dependencies in the occupancy of residue positions throughout evolution are crucial for protein function, but we found that three current Rosetta design approaches fail to recover these co-evolutionary couplings. Thus, we developed the Rosetta method ResCue (residue-coupling enhanced), which leverages co-evolutionary information to favor sequences that recapitulate correlated mutations as observed in nature. To assess the protocols via recapitulation designs, we compiled a benchmark of ten proteins, each represented by two structurally diverse states. We demonstrated that ResCue designed sequences with an average sequence recovery rate of 70%, whereas three other protocols reached no more than 50% on average. Our approach also achieved higher recovery rates for functionally important residues, which were studied in detail. This improvement has only a minor negative effect on the fitness of the designed sequences as assessed by Rosetta energy. In conclusion, our findings support the idea that informing protocols with co-evolutionary signals helps to design stable and native-like proteins that are compatible with the different conformational states required for a complex function.
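The sequence recovery rate quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definition, namely the fraction of positions at which a designed sequence carries the same residue as the native sequence; the exact definition used in the study is not given here. The function names and the example sequences are hypothetical.

```python
from typing import Sequence


def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def mean_recovery(native: str, designs: Sequence[str]) -> float:
    """Average recovery of several designs against one native sequence."""
    if not designs:
        raise ValueError("at least one design is required")
    return sum(sequence_recovery(native, d) for d in designs) / len(designs)


# Hypothetical ten-residue example: seven of ten positions match the native.
native_seq = "ACDEFGHIKL"
design_seq = "ACDEFGHAAA"
recovery = sequence_recovery(native_seq, design_seq)
```

Averaging this value over all designs of a benchmark protein, and then over the proteins, would give a figure comparable in spirit to the averages reported in the abstract.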

Highlights

  • Proteins play a vital role in fundamental processes of life, and their diverse three-dimensional structures allow for highly diverse functions

  • The number of conserved residues is small in many proteins and not all important residues can be captured by conservation analysis

  • Nowadays, advanced methods allow us to deduce networks of co-evolving residues from multiple sequence alignments
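As an illustration of how correlated residue positions can be read from a multiple sequence alignment, the sketch below computes the mutual information between two alignment columns. This is a deliberately simple stand-in and not the statistical method used in the study, which is not specified here; dedicated coupling methods typically also correct for indirect and phylogenetic effects. The toy alignment and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def column(msa: List[str], i: int) -> List[str]:
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def mutual_information(msa: List[str], i: int, j: int) -> float:
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 is conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
coupled = mutual_information(toy_msa, 0, 1)      # columns that co-vary
uncoupled = mutual_information(toy_msa, 0, 2)    # conserved column carries no signal
```

The conserved column yields no coupling signal even though it may be functionally important, which mirrors the point that conservation analysis and coupling analysis capture different residues.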

Introduction

Proteins play a vital role in fundamental processes of life, and their diverse three-dimensional structures allow for highly diverse functions. A critical element of Rosetta is a scoring function that is fine-tuned to respect knowledge-based statistics and physical approximations. Without additional restraints, this scoring function reflects the thermodynamic stability of one static protein conformation in a distinct environment [6]. Because protein function often relies on structural flexibility [7], multiple Rosetta protocols have been developed to favor sequences that both thermostabilize the protein and account for its flexibility. The multi-state design (MSD) implementation RECON [8, 11] iteratively optimizes the individual sequences of the conformational states. Each iteration increases a restraint that drives the individually designed sequences to converge into a single sequence that supports all conformations.
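The idea of an iteratively increasing convergence restraint can be sketched with a toy model. This is not the RECON implementation in Rosetta: the per-position energy tables, the weight schedule, and the mismatch penalty are all assumptions made for illustration. Each state first picks its own lowest-energy residue at every position; then a growing penalty for disagreeing with the other states pulls the sequences toward one consensus.

```python
from typing import Dict, List, Sequence

# Hypothetical input: one energy table per state, one dict per position.
StateEnergies = List[Dict[str, float]]


def converge_sequences(
    states: List[StateEnergies],
    weights: Sequence[float] = (0.0, 0.5, 1.0, 2.0),
) -> List[str]:
    """Toy multi-state design with a restraint weight that grows per iteration."""
    # Start from the unrestrained optimum of every state.
    seqs = [[min(pos, key=pos.get) for pos in state] for state in states]
    for w in weights:
        for s, state in enumerate(states):
            others = [seqs[t] for t in range(len(seqs)) if t != s]
            for p, pos in enumerate(state):
                # Cost = state-specific energy + penalty per disagreeing state.
                def cost(res: str) -> float:
                    mismatches = sum(o[p] != res for o in others)
                    return pos[res] + w * mismatches

                seqs[s][p] = min(pos, key=cost)
    return ["".join(seq) for seq in seqs]


# Two hypothetical states that disagree at the first position.
state_a = [{"A": -2.0, "G": -0.5}, {"L": -1.0, "V": -0.5}]
state_b = [{"A": -1.0, "G": -1.8}, {"L": -1.0, "V": -0.2}]

unrestrained = converge_sequences([state_a, state_b], weights=(0.0,))
restrained = converge_sequences([state_a, state_b])
```

Ramping the weight gradually, rather than forcing agreement from the start, lets each state explore its own preferred sequence before a compromise is imposed. In this toy setup the residue that is strongly favored in one state and only mildly disfavored in the other becomes the shared choice.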
