A significant challenge in applying machine learning to computational chemistry, particularly given the growing complexity of contemporary machine learning models, is the scarcity of available experimental data. To address this issue, we introduce an approach that derives molecular features from an intricate neural network-based model and applies them to a simpler, conventional machine learning model that is robust to overfitting. This method can be applied to predict various properties of a liquid system, such as viscosity or surface tension, based on molecular features drawn from the ab initio calculated free energy of solvation. Furthermore, we propose a modified kernel model that includes Arrhenius temperature dependence to incorporate theoretical principles and reduce extreme nonlinearity in the model. The modified kernel model demonstrated significant improvements in certain scenarios and may be extended to incorporate other theoretical concepts of molecular systems.