Abstract

The prediction of protein-ligand binding free energies is an important goal of computational biochemistry, yet accuracy, reproducibility, and cost remain problematic. All three must be addressed before computational methods can become standard binding prediction tools in discovery pipelines. Here, we present the results of an extensive search for an optimal method based on an ensemble of umbrella sampling all-atom molecular simulations, tested on the phosphorylated tetrapeptide pYEEI binding to the SH2 domain. The method yields an accurate and converged binding free energy of -9.0 ± 0.5 kcal/mol (compared to an experimental value of -8.0 ± 0.1 kcal/mol). We find that a minimum of 300 ns of sampling is required for every prediction, a target easily achievable using new-generation accelerated molecular dynamics (MD) codes. Convergence is obtained by using an ensemble of simulations per window, each starting from a different initial conformation, and by optimizing the window width, the orthogonal restraints, the harmonic potentials along the reaction coordinate, and the sampling time per window. The use of uncorrelated initial conformations in neighboring windows is important for correctly sampling the conformational transitions between the unbound and bound states, which significantly affect the precision of the calculations. This methodology thus provides a general recipe for reproducible and practical computations of binding free energies for a class of semirigid protein-ligand systems, within the accuracy limits of the force field used.
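As a rough illustration of the ingredients named above, the following Python sketch shows a harmonic umbrella restraint, evenly spaced window centers, and the combination of independent replica estimates into a mean and standard error. It is a minimal sketch under stated assumptions, not the protocol used in this work: the function names, the force constant, the window spacing, and the replica values are all hypothetical, and the abstract does not state how the reported uncertainty was computed.

```python
import math
import statistics


def harmonic_bias(xi, center, k):
    """Harmonic umbrella restraint energy, 0.5 * k * (xi - center)^2.

    Units are whatever the caller uses consistently
    (e.g. k in kcal/mol/A^2 and xi in A gives kcal/mol).
    """
    return 0.5 * k * (xi - center) ** 2


def window_centers(start, stop, width):
    """Evenly spaced window centers from start to stop, inclusive."""
    n = int(round((stop - start) / width))
    return [start + i * width for i in range(n + 1)]


def ensemble_estimate(values):
    """Mean and standard error of the mean over independent replica estimates."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical numbers, for illustration only (not taken from the study).
centers = window_centers(0.0, 2.0, 0.5)
replica_dG = [-9.0, -8.0, -10.0]  # made-up per-replica estimates in kcal/mol
mean_dG, sem_dG = ensemble_estimate(replica_dG)
print(centers, round(mean_dG, 2), round(sem_dG, 2))
```

Treating each replica as an independent estimate is only reasonable when the replicas start from uncorrelated conformations, which is the condition the abstract emphasizes.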
