The SAMPL-1 hydration free energy blind prediction challenge data set includes 63 compounds that are more chemically diverse, polyfunctional, and drug-like than those in previously tabulated data sets of neutral compounds, with some transfer free energies and molecular weights larger than any seen before in such data sets. For the prospective SAMPL-1 study, we employed a continuum model including a boundary element solution of the Poisson equation to describe electrostatic solvation, a molecular surface area-based cost of cavity formation in water, and a continuum Lennard-Jones potential to account for solute-solvent dispersion-repulsion effects. For the latter contribution, continuum van der Waals atom-type coefficients were calibrated and validated on previously available hydration data sets. In the prospective study, this continuum hydration model yielded SAMPL-1 predictions highly correlated with experimental data, albeit with a slope slightly above 0.5, suggesting a valid model with a systematic error. Analysis of the major outliers, all overestimating the experimental hydration data, highlights a common structural theme as a possible cause of the prediction errors: densely polar and hydrogen-bond-capable structures, featuring primarily substituted (sulfon)amide groups, often in conjugated systems. Examination of analog pairs within the SAMPL-1 data set also showed that certain solvation trends are captured neither by chemical sense nor by our hydration model, both of which seem too additive. A retrospective analysis of model transferability between hydration data sets as a function of model parameters and complexity indicates that the electrostatic component of the model is fairly transferable across data sets, but the nonelectrostatic terms are less so. For the chemical space covered in SAMPL-1, absolute prediction errors indicate that the simpler, transferable electrostatics-only model outperforms the more complex model including cavity and continuum dispersion terms.
Possible directions to further improve this continuum hydration model are proposed.
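Schematically, the three-term continuum model described above can be read as the following decomposition. This is only an illustrative sketch: the symbols (\gamma for a surface-area coefficient, A_{MS} for the molecular surface area, \rho_w for the bulk water number density, u_i^{LJ} for the Lennard-Jones interaction of solute atom i with a water site) are assumed notation, not taken from the paper, and the exact functional forms used by the authors may differ.

\[
\Delta G_{\mathrm{hyd}} \approx \Delta G_{\mathrm{elec}} + \gamma\, A_{\mathrm{MS}} + \Delta G_{\mathrm{vdW}},
\qquad
\Delta G_{\mathrm{vdW}} = \sum_{i \in \mathrm{solute}} \rho_w \int_{\mathrm{solvent}} u_i^{\mathrm{LJ}}\big(\lvert \mathbf{r} - \mathbf{r}_i \rvert\big)\, d^3 r
\]

In this notation, the electrostatics-only model referred to above corresponds to keeping only the first term, \Delta G_{\mathrm{hyd}} \approx \Delta G_{\mathrm{elec}}.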