Abstract

Atomistic molecular dynamics (MD) simulations of protein molecules are too computationally expensive to predict most native structures from amino acid sequences. Here, we integrate "weak" external knowledge into folding simulations to predict protein structures from their sequences. For example, we instruct the computer "to form a hydrophobic core," "to form good secondary structures," or "to seek a compact state." Until now, this kind of information has been too combinatoric, nonspecific, and vague to help guide MD simulations. To harness such weak information, we develop a statistical mechanical framework within atomistic replica-exchange molecular dynamics (REMD): modeling using limited data with coarse physical insight(s) (MELD + CPI). As a test, we apply MELD + CPI to predict the native structures of 20 small proteins. MELD + CPI samples structures less than 3.2 Å from native for all 20 proteins and correctly chooses the native structure (<4 Å) for 15 of them, including ubiquitin, a millisecond folder. MELD + CPI is up to five orders of magnitude faster than brute-force MD, satisfies detailed balance, and should scale well to larger proteins. MELD + CPI may be useful where physics-based simulations are needed to study protein mechanisms and populations and where we have some heuristic or coarse physical knowledge about states of interest.
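To make the detailed-balance claim concrete, the following is a minimal sketch of the textbook Metropolis criterion for swapping two replicas in plain temperature REMD. It is an illustration only, not the MELD + CPI exchange scheme itself. The function names, the energy units (kcal/mol), and the example temperatures are assumptions introduced here.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
K_B = 0.0019872041


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of exchanging the configurations of two replicas.

    Replica i currently has potential energy energy_i at temperature temp_i,
    and replica j has energy_j at temp_j (textbook temperature REMD).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # min(1, exp(delta)), written to avoid overflow for large positive delta
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


if __name__ == "__main__":
    # Hypothetical energies (kcal/mol) and temperatures (K), for illustration only.
    print(swap_probability(-105.0, -100.0, 300.0, 320.0))
    print(attempt_swap(-105.0, -100.0, 300.0, 320.0))
```

Because the forward and reverse acceptance probabilities differ by exactly the ratio of the joint Boltzmann weights before and after the swap, this exchange rule preserves the Boltzmann distribution at every temperature.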

Highlights

  • How might we guide MD simulations to states of interest when we do not know what those structures are? We describe an approach based on coarse physical insight(s) (CPI), that is, heuristic knowledge about the states of interest

  • The third measure, BestStruc, reports the lowest backbone rmsd of any single structure sampled in the simulations. This test is more specific to modeling using limited data (MELD) + CPI itself, which helps us distinguish flaws of MELD + CPI from flaws of the force field per se
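The BestStruc measure described above can be sketched as a minimum over per-frame backbone rmsd values. The example below is a hedged illustration, not the authors' analysis code: it assumes backbone atoms have already been selected into (N, 3) coordinate arrays in Å, it superimposes each frame on the native structure with the standard Kabsch algorithm, and the names `kabsch_rmsd` and `best_struc` are hypothetical.

```python
import numpy as np


def kabsch_rmsd(coords, ref):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    coords = np.asarray(coords, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Remove translation by centering both sets on their centroids
    p = coords - coords.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # Singular values of the covariance matrix give the best achievable overlap
    u, s, vt = np.linalg.svd(p.T @ q)
    # Negate the smallest singular value if the optimal map would be a reflection
    if np.linalg.det(u @ vt) < 0.0:
        s[-1] = -s[-1]
    msd = (np.sum(p ** 2) + np.sum(q ** 2) - 2.0 * s.sum()) / len(p)
    # Guard against tiny negative values caused by floating-point cancellation
    return float(np.sqrt(max(msd, 0.0)))


def best_struc(frames, native):
    """Return (index, rmsd) of the sampled frame closest to the native structure."""
    rmsds = [kabsch_rmsd(frame, native) for frame in frames]
    best = int(np.argmin(rmsds))
    return best, rmsds[best]
```

Since this measure looks only at whether a near-native structure was ever sampled, not at whether it was ranked best, it tests the sampling method separately from the scoring by the force field.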

Summary

Introduction

Atomistic molecular dynamics (MD) simulations of protein molecules are too computationally expensive to predict most native structures from amino acid sequences. A proper physical model requires a plausible physical energy function that can accurately predict native structures (validation); that applies across many different proteins (transferable); that satisfies Boltzmann’s law (physical); that scales up to sufficiently large proteins (practical); and, when predicting folding, that begins from the fully unfolded state (to avoid inadvertent biases). These objectives are largely not met by bioinformatics algorithms, which do not satisfy Boltzmann’s law, or by current atomistic simulations, which are too computationally expensive to tackle sizable proteins starting from fully unfolded states. Because our end goal is fundamentally to get proper populations, we seek a method

