Understanding the interactions between solutes and solvents is vital in many areas of the chemical sciences. Solvation free energy (SFE) is an important thermodynamic property for characterising molecular solvation, so accurate prediction of this property is sought after. The One-Dimensional Reference Interaction Site Model (RISM) is a well-established method for modelling solvation, but it is known to yield large errors in the calculation of SFE. In this work, we show that a single machine learning free energy functional for RISM can accurately model solvation thermodynamics in multiple solvents. A convolutional neural network is trained on solvation free energy density functions calculated by RISM for small organic molecules in approximately 100 different solvent systems. We achieve an average RMSE of 1.41 kcal/mol and an R² of 0.89 across all solvent systems. We also compare the performance for the most and least commonly represented solvents and show that accuracy is generally higher for solvents with more data, with RMSE values of 0.69–1.29 kcal/mol and R² values of 0.78–0.97 for solvents with more than 50 data points. These results show that machine learning can greatly improve solvation free energy predictions in RISM and that the methodology is generalisable across solvent systems. This represents a significant step towards a universal machine learning SFE functional for RISM.
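As an illustration of the per-solvent evaluation described above, the following minimal sketch computes RMSE and R² separately for each solvent. It assumes R² means the coefficient of determination (the abstract does not say whether this or a squared correlation coefficient is used). The function names and the toy values are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        return float("nan")  # undefined when all reference values are identical
    return 1.0 - ss_res / ss_tot


def metrics_by_solvent(solvents, y_true, y_pred):
    """Group (reference, predicted) SFE pairs by solvent and score each group."""
    groups = defaultdict(lambda: ([], []))
    for solvent, t, p in zip(solvents, y_true, y_pred):
        groups[solvent][0].append(t)
        groups[solvent][1].append(p)
    return {
        solvent: {"n": len(t), "rmse": rmse(t, p), "r2": r_squared(t, p)}
        for solvent, (t, p) in groups.items()
    }


# Toy data only: made-up SFE values (kcal/mol) for two solvents.
solvents = ["water", "water", "water", "octanol", "octanol"]
reference = [-1.0, -2.0, -3.0, -4.0, -6.0]
predicted = [-1.0, -2.0, -4.0, -4.0, -6.0]

results = metrics_by_solvent(solvents, reference, predicted)
for name, scores in results.items():
    print(name, scores)
```

Reporting the number of points `n` alongside each score makes it easier to relate accuracy to data volume, for example when separating solvents with more than 50 data points from sparsely represented ones.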