Abstract

A mapping of a macromolecule is a prescription to construct a simplified representation of the system in which only a subset of its constituent atoms is retained. As the specific choice of the mapping affects the analysis of all-atom simulations as well as the construction of coarse-grained models, the characterisation of the mapping space has recently attracted increasing attention. Here we introduce a notion of scalar product and distance between reduced representations, which allows the study of the metric and topological properties of their space in a quantitative manner. Making use of a Wang–Landau enhanced sampling algorithm, we exhaustively explore this space and examine the qualitative features of mappings in terms of their squared norm. A one-to-one correspondence with an interacting lattice gas on a finite volume leads to the emergence of discontinuous phase transitions in mapping space, which mark the boundaries between qualitatively different reduced representations of the same molecule.

Highlights

  • The research area of computational molecular biophysics has experienced, in the past few decades, impressive advancements in two complementary and strictly intertwined fields: on the one hand, the steadily growing and increasingly cheaper computational power has enabled the simulation of ever larger systems with atomistic resolution [1,2]; on the other hand, there has been an explosion of diverse coarse-grained (CG) models [3,4,5], i.e. simpler representations of molecules in terms of relatively few sites interacting through effective potentials: these have filled several gaps between the length- and time-scales of interest and the current capability of all-atom methods to cover them

  • An intuitive example of this concept is given by the representation of a protein structure in terms of its Cα’s, i.e. the alpha carbons of the backbone: this mapping is extensively employed in the development of CG models [16,17] (that is, models in which the whole amino acid is represented as a single bead whose position coincides with that of the Cα), but it is also extremely common in the analysis of structures sampled in fully atomistic simulations [18,19]

  • A recent protocol proposed by us [23] revolves around the analysis of an all-atom molecular dynamics (MD) [26,27] simulation trajectory of a protein in terms of a subset of the molecule’s atoms; a physics-driven choice of the latter allows one to identify the one or few mappings that return the most parsimonious yet informative simplified description of the system


Summary

Introduction

The research area of computational molecular biophysics has experienced, in the past few decades, impressive advancements in two complementary and closely intertwined fields: on the one hand, the steadily growing and increasingly cheaper computational power has enabled the simulation of ever larger systems with atomistic resolution [1,2]; on the other hand, there has been an explosion of diverse coarse-grained (CG) models [3,4,5], i.e. simpler representations of molecules in terms of relatively few sites interacting through effective potentials. These models have filled several gaps between the length- and time-scales of interest and the current capability of all-atom methods to cover them.

A recent protocol proposed by us [23] revolves around the analysis of an all-atom molecular dynamics (MD) [26,27] simulation trajectory of a protein in terms of a subset of the molecule’s atoms; a physics-driven choice of the latter allows one to identify the one or few mappings that return the most parsimonious yet informative simplified description of the system. Each of these methods can be the most appropriate to investigate specific properties of the system at hand; at the same time, the majority of them perform the search for solutions of an optimisation problem within the overwhelmingly large space of all possible CG representations that can be assigned to the system.

The paper is organised as follows: in Sect. 2 we develop a scalar product between decimation mappings of a macromolecular structure in a static conformation, and derive from it a notion of norm and distance in the mapping space; in Sect. 3 we study CG representations in terms of the distribution of values of the squared norm for mappings having a given number of retained sites N, first through random sampling, then making use of the Wang–Landau enhanced sampling method; in Sect. 4 we exploit a duality between the problem of mappings of a macromolecule and that of an interacting lattice gas in a finite volume to investigate the properties of the molecule’s reduced representations; in Sect. 5 we discuss some topological features of the mapping space; in Sect. 6 we discuss an extension of the structure-based definition of the norm that includes information about the system’s energetics; in Sect. 7 we sum up the results of this work and discuss its future perspectives.
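To make the sampling strategy outlined for Sect. 3 more concrete, the following is a minimal Python sketch of Wang–Landau sampling over decimation mappings with a fixed number of retained sites. It is an illustration under stated assumptions, not the authors' implementation: the squared norm of Sect. 2 is not reproduced in this excerpt, so a hypothetical stand-in observable is used (the number of neighbouring retained sites on a toy open 1D chain, in the spirit of a lattice gas). The function names, the swap move and all parameter values are likewise assumptions chosen for the example.

```python
import math
import random


def adjacent_pairs(occ):
    """Toy observable (an assumption, NOT the paper's squared norm):
    number of neighbouring retained sites on an open 1D chain."""
    return sum(1 for i in range(len(occ) - 1) if occ[i] and occ[i + 1])


def wang_landau(n_atoms, n_retained, ln_f_final=1e-4, flatness=0.8,
                check_every=2000, seed=0):
    """Estimate ln g(E), the log-number of mappings with n_retained sites
    for each value E of the toy observable. Every value 0..n_retained-1
    must be reachable, i.e. n_atoms >= 2 * n_retained - 1."""
    rng = random.Random(seed)
    # Start from the mapping that retains the first n_retained sites.
    occ = [1] * n_retained + [0] * (n_atoms - n_retained)
    retained = list(range(n_retained))
    discarded = list(range(n_retained, n_atoms))
    n_bins = n_retained
    ln_g = [0.0] * n_bins
    hist = [0] * n_bins
    e = adjacent_pairs(occ)
    ln_f, step = 1.0, 0
    while ln_f > ln_f_final:
        # Propose swapping one retained site with one discarded site,
        # which keeps the number of retained sites fixed.
        a = rng.randrange(len(retained))
        b = rng.randrange(len(discarded))
        i, j = retained[a], discarded[b]
        occ[i], occ[j] = 0, 1
        e_new = adjacent_pairs(occ)
        # Accept with probability min(1, g(e) / g(e_new)).
        if math.log(1.0 - rng.random()) <= ln_g[e] - ln_g[e_new]:
            retained[a], discarded[b] = j, i
            e = e_new
        else:
            occ[i], occ[j] = 1, 0  # rejected: undo the swap
        # Penalise the current value and record the visit.
        ln_g[e] += ln_f
        hist[e] += 1
        step += 1
        # When the histogram is flat enough, refine the modification factor.
        if step % check_every == 0 and min(hist) > flatness * sum(hist) / n_bins:
            hist = [0] * n_bins
            ln_f /= 2.0
    return ln_g


def normalised_counts(ln_g, n_atoms, n_retained):
    """Rescale g(E) so that it sums to the binomial coefficient C(n_atoms, n_retained)."""
    m = max(ln_g)
    w = [math.exp(x - m) for x in ln_g]
    total = math.comb(n_atoms, n_retained)
    return [total * x / sum(w) for x in w]
```

For a toy chain of 10 sites with 4 retained, `normalised_counts(wang_landau(10, 4), 10, 4)` can be compared with the exact counts 35, 105, 63 and 7 obtained by direct enumeration of the 210 possible mappings. For a real molecule such enumeration is out of reach, which is what motivates a flat-histogram method; the observable would then be replaced by the squared norm developed in Sect. 2.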

Theory
Exploration of the mapping space
Norm distributions
Inner product distributions
Lattice gas analogy and phase transitions
Topology
Topology of the mapping norm space
Topology of mapping entropy space
Extension of the theory to equilibrium sampling: preliminary results
Conclusions
