Abstract

We provide an unsupervised adaptive sampling strategy capable of producing μs-timescale molecular dynamics (MD) simulations of large biosystems using many-body polarizable force fields (PFFs). The global exploration problem is decomposed into a set of separate MD trajectories that can be restarted within a selective process to achieve sufficient phase-space sampling. Accurate statistical properties can be obtained through reweighting. Within this highly parallel setup, the Tinker-HP package can be powered by an arbitrarily large number of GPUs on supercomputers, reducing exploration time from years to days. This approach is used to tackle the urgent modeling problem of the SARS-CoV-2 Main Protease (Mpro), producing more than 38 μs of all-atom simulations of its apo (ligand-free) dimer using the high-resolution AMOEBA PFF. The first 15.14 μs simulation (physiological pH) is compared to available non-PFF long-timescale simulation data. A detailed clustering analysis exhibits striking differences between FFs, with AMOEBA showing a richer conformational space. Focusing on key structural markers related to the oxyanion hole stability, we observe an asymmetry between protomers. One of them appears less structured, resembling the experimentally inactive monomer, for which a 6 μs simulation was performed as a basis for comparison. Results highlight the plasticity of the Mpro active site. The C-terminal end of its less structured protomer is shown to oscillate between several states, being able to interact with the other protomer, potentially modulating its activity. Active and distal site volumes are found to be larger in the most active protomer within our AMOEBA simulations compared to non-PFFs, as additional cryptic pockets are uncovered. A second 17 μs AMOEBA simulation is performed with protonated His172 residues, mimicking lower pH. The data show the impact of protonation on the destructuring of the oxyanion loop.
We finally analyze the solvation patterns around key histidine residues. The confined AMOEBA polarizable water molecules are able to explore a wide range of dipole moments, going beyond bulk values, leading to a water molecule count consistent with experimental data. Results suggest that the use of PFFs could be critical in drug discovery to accurately model the complexity of the molecular interactions structuring Mpro.

Highlights

  • At the end of December 2019, a novel coronavirus (CoV) that induces severe acute respiratory disease (SARS) was discovered and labeled SARS-CoV-2.[1] It causes the disease named COVID-19, which led to a global pandemic in 2020 and finally to an urgent global issue. Great effort has been made to gain insights into the action of the virus on the human body.

  • When our project started in response to the international High-Performance Computing (HPC) global effort to mitigate the impact of the COVID-19 pandemic,[18,19,20] performing long-timescale Molecular Dynamics (MD) simulations using new generations of polarizable force fields (PFFs) on SARS-CoV-2 proteins encompassing hundreds of thousands of atoms, such as the Main Protease (Mpro), was out of reach of generalist supercomputers. Such simulations would have required years of computation. To overcome these limitations, we introduce a density-driven unsupervised adaptive sampling method based on statistical models and principal component analysis (PCA).

  • We investigated the differences in clustering results, active site volumes, cryptic pockets, key structural activation markers linked to the oxyanion hole structuring, interactions between the C-terminal chain and the active site, and solvation patterns of some key residues.


Summary

Introduction

At the end of December 2019, a novel coronavirus (CoV) that induces severe acute respiratory disease (SARS) was discovered and labeled SARS-CoV-2.[1] It causes the disease named COVID-19, which led to a global pandemic in 2020 and finally to an urgent global issue.

This unsupervised selection step has the advantage of bypassing the critical choice of initial collective variables at the beginning of the simulation, reinforcing the automation of the sampling scheme. This strategy belongs to the family of count-based adaptive sampling algorithms, which exploit only the number of passages through the different states (micro or macro) visited in previous iterations to choose the states from which trajectories are restarted.
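To make the count-based idea concrete, here is a minimal sketch in Python. It is not the selection scheme used in this work: it assumes the visited states are already available as discrete labels, and it weights restarts by the inverse of the visit count, which is only one common choice among count-based schemes. The function names, the state labels "A", "B", "C", and the visit history are all hypothetical.

```python
import random
from collections import Counter


def restart_probabilities(state_visits):
    """Return a restart probability per state, proportional to 1 / visit count.

    Assumption: inverse-count weighting. Rarely visited states get a higher
    probability, which pushes new trajectories toward under-sampled regions.
    """
    counts = Counter(state_visits)
    inverse = {state: 1.0 / n for state, n in counts.items()}
    total = sum(inverse.values())
    return {state: w / total for state, w in inverse.items()}


def choose_restart_states(state_visits, n_trajectories, seed=None):
    """Draw the states from which the next batch of trajectories restarts."""
    rng = random.Random(seed)
    probs = restart_probabilities(state_visits)
    states = sorted(probs)  # fixed order so a given seed is reproducible
    weights = [probs[s] for s in states]
    return rng.choices(states, weights=weights, k=n_trajectories)


# Hypothetical visit history from previous iterations: 6, 3 and 1 passages.
visits = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
probs = restart_probabilities(visits)
restarts = choose_restart_states(visits, n_trajectories=4, seed=0)
```

With this history, the least-visited state "C" receives the largest share of restarts and the most-visited state "A" the smallest. Only visit counts enter the selection, so no collective variable has to be chosen by hand at this step.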

Preparation of systems and choice of initial structures
Simulation protocol
Markers of the structuring of the oxyanion hole
Evaluation of the volumes of the enzyme cavities
Analysis of the local fluctuations: high flexibility of the C-terminal region
Comparative ligandability analysis: searching for cryptic pockets
Further simulation at lower pH: impact of His172 protonation
Findings
Conclusion and perspectives