Abstract
Recent developments in computer processing power have led to new paradigms for addressing problems in many-body physics, and in polymer physics in particular. Parallel processors can be exploited to generate millions of molecular configurations in complex environments per second, from which the corresponding free-energy landscapes can be estimated. Databases that are complete in terms of polymer sequences and architecture form a powerful basis for training, cross-checking, and verifying machine learning-based models. We employ an exhaustive enumeration of polymer sequence space to benchmark the predictions of a neural network. In our example, we consider the translocation time of a copolymer through a lipid membrane as a function of its sequence of hydrophilic and hydrophobic units. First, we demonstrate that massively parallel Rosenbluth sampling for all possible sequences of a polymer allows for a meaningful dynamic interpretation in terms of the mean first escape times through the membrane. Second, we train a multi-layer neural network on logarithmic translocation times and show, by reducing the training set to a narrow window of translocation times, that the neural network develops an internal representation of the physical rules for sequence-controlled diffusion barriers. Trained on this narrow set, the network approximates the order of magnitude of translocation times in a window that is several orders of magnitude wider than the training window. We investigate how prediction accuracy depends on the distance of unexplored sequences from the training window.
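To make the link between a free-energy landscape and an escape time concrete, the following is a minimal sketch that evaluates the standard one-dimensional Smoluchowski expression for the mean first passage time, τ = (1/D) ∫ dx exp(F(x)) ∫ dy exp(−F(y)), with a reflecting boundary at `a` and an absorbing boundary at `b`. It is an illustration only and is not taken from the paper: the function name, the constant diffusion coefficient, the trapezoidal integration, and the Gaussian barrier in the usage note are all assumptions.

```python
import math


def mean_first_passage_time(free_energy, a, b, diffusion=1.0, n=2000):
    """Estimate the mean first passage time from a (reflecting) to b (absorbing).

    free_energy: callable returning F(x) in units of k_B T (hypothetical input).
    diffusion:   constant diffusion coefficient (an assumption of this sketch).
    n:           number of trapezoidal integration intervals.
    """
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    boltz = [math.exp(-free_energy(x)) for x in xs]      # exp(-F(y))
    inv_boltz = [math.exp(free_energy(x)) for x in xs]   # exp(+F(x))

    # Running trapezoidal integral of exp(-F(y)) from a up to each grid point.
    inner = [0.0] * (n + 1)
    for i in range(1, n + 1):
        inner[i] = inner[i - 1] + 0.5 * h * (boltz[i - 1] + boltz[i])

    # Outer trapezoidal integral of exp(F(x)) * inner(x) from a to b.
    integrand = [inv_boltz[i] * inner[i] for i in range(n + 1)]
    outer = sum(0.5 * h * (integrand[i - 1] + integrand[i]) for i in range(1, n + 1))
    return outer / diffusion


# Flat landscape: free diffusion over unit length.
flat = mean_first_passage_time(lambda x: 0.0, 0.0, 1.0)

# Hypothetical Gaussian barrier of height 5 k_B T in the middle of the interval.
barrier = mean_first_passage_time(
    lambda x: 5.0 * math.exp(-((x - 0.5) / 0.1) ** 2), 0.0, 1.0
)
```

For the flat landscape the expression reduces to (b − a)²/(2D), which gives a quick sanity check; any barrier in the landscape increases the estimate relative to that value.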
Highlights
Polymers are many-body physical objects; in order to describe their equilibrium state and dynamics, it is often required to translate chemical sequence information into free-energy landscapes in three-dimensional space
Our results confirm that polymer translocation is controlled by the balance of the polymer's overall hydrophobicity and is inhibited by adsorption at the bilayer–solvent interfaces[26,27,28,31], which is consistent with the picture for small solutes[67] and larger solid objects such as carbon nanotubes[68]
Amphiphilic polymers at a balanced hydrophobicity show the smallest translocation times when the sequence exposes small repeating amphiphilic features, while the longest waiting times are associated with a diblock structure of the whole chain
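The sequence features mentioned above can be illustrated with a short sketch that enumerates every two-letter sequence of a given length and computes two simple descriptors: the hydrophobic fraction and the number of contiguous blocks. The labels `"A"` (hydrophilic) and `"B"` (hydrophobic), the function names, and the choice of descriptors are assumptions made for illustration; they are not the paper's notation or features.

```python
from itertools import product


def enumerate_sequences(n):
    """Yield all 2**n sequences of 'A' (hydrophilic) and 'B' (hydrophobic) units."""
    for units in product("AB", repeat=n):
        yield "".join(units)


def hydrophobic_fraction(seq):
    """Fraction of hydrophobic ('B') units in the sequence."""
    return seq.count("B") / len(seq)


def block_count(seq):
    """Number of contiguous runs of identical units."""
    return 1 + sum(1 for left, right in zip(seq, seq[1:]) if left != right)


# All balanced sequences of length 8, sorted from blocky to finely alternating.
balanced = sorted(
    (s for s in enumerate_sequences(8) if hydrophobic_fraction(s) == 0.5),
    key=block_count,
)
```

In this toy picture, a balanced diblock such as `AAAABBBB` has two blocks, while a strictly alternating chain such as `ABABABAB` has as many blocks as monomers, so the block count separates the two extremes named in the highlight.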
Summary
Polymers are many-body physical objects; in order to describe their equilibrium state and dynamics, it is often required to translate chemical sequence information into free-energy landscapes in three-dimensional space. The sequence space accessible through current polymer chemistry[1,2,3] or in biopolymers exceeds the limits for closed physical descriptions and is not accessible for complete scans by molecular simulation techniques. A prominent problem for sequence-controlled polymers is their transport through lipid membranes and biological barriers, which is linked to a wide field of potential biomedical and biotechnological applications. The translocation time of polymer chains through a narrow nano-pore on the scale of one monomer has been described for homopolymers[15,16] by means of scaling relations, and the theory was later extended to block copolymers[17,18]. The absence of a closed analytic theory for sequence-controlled translocation does not exclude technical applications of nano-pores for DNA sequencing[21,22,23].