Abstract
Protein folding and design are major biophysical problems whose solution would lead to important applications, especially in medicine. Here we provide evidence that a novel parametrization of the Caterpillar model may be used for both quantitative protein design and folding. Computer simulations show that, for a large set of real protein structures, the model produces designed sequences with physical properties similar to those of the corresponding naturally occurring sequences. The designed sequences require further experimental testing. For an independent set of proteins, previously used as a benchmark, we also demonstrate correct folding of both the designed and the natural sequences. The equilibrium folding properties are characterized by free energy calculations. The resulting free energy profiles are not only consistent between natural and designed proteins, but also show remarkable precision when the folded structures are compared to the experimentally determined ones. Ultimately, the updated Caterpillar model is unique in combining three fundamental features: its simplicity, its ability to produce natural foldable designed sequences, and its structure prediction precision. It is also remarkable that low-frustration sequences can be obtained with such a simple and universal design procedure, and that the folding of natural proteins shows funnelled free energy landscapes without the need for any potentials based on the native structure.
Highlights
Computer simulations of the protein folding process have, in the last ten years, reached an impressive level of description and accuracy [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16].
We will show below that the model is capable of refolding several natural sequences, demonstrating that, in the model, the presence of repeats is not necessary to stabilize natural protein structures.
To the best of our knowledge, our coarse-grained protein model is the simplest, in terms of the number of parameters needed, with a transferable energy function capable of achieving such precision in the prediction of native folded structures. It is one of the very few models that allow for both quantitative protein design and folding, the latter demonstrated by free energy calculations.
Summary
Computer simulations of the protein folding process have, in the last ten years, reached an impressive level of description and accuracy [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]. In the "minimal frustration principle" (MFP) [18,19,20], protein folding is described as a downhill sliding process in a low-frustration ("funnelled") energy landscape towards the native state. While the MFP has been proven for lattice heteropolymers [19,21,22,23,24,25,26,27], more realistic protein representations often show a residual frustration that prevents the systematic prediction of the native structure of natural sequences. The group of David Baker [38] introduced a novel procedure to select low-frustration sequences capable of correctly refolding in vitro to their target structure, with a success rate between 8% and 40% of the total trials. The complexity of Baker's procedure demonstrates that it is not easy to produce sequences with low frustration.