Neural Upscaling from Residue-Level Protein Structure Networks to Atomistic Structures.

Vy T. Duong, Gianmarc Grazioli, Rachel W. Martin, Carter T. Butts, Elizabeth M. Diessner

doi:10.3390/biom11121788

Open Access

Abstract

Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail, an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.

Summary

Introduction

Proteins and other biological macromolecules exhibit a wide variety of complex dynamics and interactions across a range of size and time scales. While atomistic molecular dynamics (MD) models currently serve as the gold standard for simulating dynamics at high resolution (with some inroads by quantum mechanical methods in small-scale or specialized applications), the cost of large-scale MD simulations limits their use to relatively small systems on time scales of microseconds or less. Coarse-grained (CG) models offer a means of accessing larger system sizes and longer time scales, sacrificing atomistic detail in exchange for reduced computational cost. In the MARTINI model, for example, the mapping corresponds roughly to one bead per four heavy atoms, with hydrogens left implicit. MARTINI and other CG MD models have proven useful in studying the structure and dynamics of large complexes, lipid phases, and other systems that are too large to be treated with atomistic MD methods [2].
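
To make the four-to-one reduction concrete, the following is a minimal sketch of a naive coarse-graining step, not the MARTINI mapping itself. It assumes heavy atoms can simply be grouped in list order and that each bead sits at the centroid of its group; real CG force fields assign atoms to beads by chemical identity. The function name `coarse_grain`, the parameter `atoms_per_bead`, and the toy coordinates are all hypothetical.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def coarse_grain(heavy_atoms: Sequence[Coord], atoms_per_bead: int = 4) -> List[Coord]:
    """Map consecutive heavy atoms onto beads placed at each group's centroid.

    Assumption: atoms are grouped purely by list order. This is an
    illustration of the compression ratio, not a chemically valid mapping.
    """
    if atoms_per_bead < 1:
        raise ValueError("atoms_per_bead must be positive")
    beads: List[Coord] = []
    for start in range(0, len(heavy_atoms), atoms_per_bead):
        group = heavy_atoms[start:start + atoms_per_bead]
        n = len(group)  # the final group may hold fewer atoms
        centroid = tuple(sum(atom[axis] for atom in group) / n for axis in range(3))
        beads.append(centroid)
    return beads


# Toy input (hypothetical): eight heavy atoms spaced 1.5 Å apart along x.
atoms = [(1.5 * i, 0.0, 0.0) for i in range(8)]
beads = coarse_grain(atoms)
print(len(atoms), "atoms ->", len(beads), "beads")
```

The bead coordinates alone do not determine where the original atoms were, which is the loss of atomistic detail that coarse-graining trades for reduced cost.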

Methods

Results

Discussion

Conclusion