Accurate force field parameters, potential energy functions, and receptor-ligand models are essential for modeling the solvation and binding of drug-like molecules to a receptor. The large and ever-growing chemical space of medicinally relevant scaffolds also requires these components, especially the force field parameters, to be highly transferable. Generalized force fields such as the CHARMM General Force Field (CGenFF) and the generalized AMBER force field (GAFF), along with contemporaries such as OPLS, have met this requirement. Here, we analyze the limits of the parametrization of drug-like small molecules by CGenFF and GAFF in terms of the functional groups represented within them. Specifically, we link the presence of specific functional groups to the error in the absolute hydration free energy of over 600 small molecules, predicted by alchemical free energy methods implemented in the CHARMM program. Our investigation reveals that molecules with (i) a nitro group are oversolubilized in aqueous medium by CGenFF and undersolubilized by GAFF, (ii) amine groups are undersolubilized, more so in CGenFF than in GAFF, and (iii) carboxyl groups are oversolubilized, more so in GAFF than in CGenFF. We present our analyses of the potential factors underlying these trends. We also showcase a machine-learning-based approach, combined with the SHapley Additive exPlanations (SHAP) framework, that attributes these trends to specific functional groups and can be easily adopted to explore the limits of other general force fields.
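To make the attribution idea concrete, the following is a minimal sketch of how Shapley values split a predicted hydration free energy error among functional groups. It is not the authors' pipeline and does not use the SHAP library: it computes exact Shapley values by enumerating subsets, which is only feasible for a handful of features. The function names (`shapley_values`, `toy_error_model`), the three functional-group flags, and every coefficient are hypothetical and chosen purely for illustration; none of the numbers come from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    x and baseline are dicts mapping feature name -> value. A feature that is
    "absent" from a coalition takes its baseline value.
    """
    features = list(x)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {g: (x[g] if g in subset else baseline[g])
                           for g in features}
                with_f = dict(without)
                with_f[f] = x[f]
                total += weight * (model(with_f) - model(without))
        phi[f] = total
    return phi


def toy_error_model(groups):
    """Hypothetical stand-in for a trained regressor (error in kcal/mol).

    All coefficients are made up for illustration, not fitted to any data.
    """
    return (0.1
            + 1.5 * groups["nitro"]
            - 0.8 * groups["amine"]
            + 0.5 * groups["nitro"] * groups["carboxyl"])


# A hypothetical molecule with a nitro and a carboxyl group, no amine
molecule = {"nitro": 1, "amine": 0, "carboxyl": 1}
baseline = {"nitro": 0, "amine": 0, "carboxyl": 0}

phi = shapley_values(toy_error_model, molecule, baseline)
for group, value in phi.items():
    print(f"{group}: {value:+.2f}")
```

The contributions sum to the difference between the model output for the molecule and for the baseline, which is the property that lets a per-molecule error be apportioned among functional groups. In practice, a trained model with many features would rely on an approximate explainer rather than full enumeration.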