Predicting solubility is a complex and challenging physicochemical problem with major implications for the chemical and pharmaceutical industries. Recent advances in machine learning have opened the way to reliable solubility prediction for a large number of molecular systems. However, most of these methods rely on physical properties obtained from experiments and expensive quantum chemical calculations. Here, we developed a method based on a graph representation of solute-solvent interactions. The representation is built with "MolMerger," which uses Gasteiger charges to capture the strongest polar interactions between molecules and creates a graph that reflects the interacting system. With these graphs as input, a neural network learns the correlation between the structural properties of a molecule, encoded as node embeddings, and its physicochemical properties as the output. We used this approach to predict the log solubility of various organic molecules and pharmaceuticals in a diverse set of solvents.
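The merging idea can be illustrated with a minimal sketch. It is not the MolMerger implementation: the abstract does not say how the strongest polar interaction is selected, so the rule used here (link the solute-solvent atom pair with the most negative product of partial charges) is an assumption, as are the names `MolGraph` and `merge_by_polar_link`. The charges are hand-written placeholders rather than computed Gasteiger charges.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """Minimal molecular graph: atoms with partial charges plus bonds."""
    elements: list[str]
    charges: list[float]  # placeholder partial charges, NOT computed Gasteiger values
    bonds: list[tuple[int, int]] = field(default_factory=list)


def merge_by_polar_link(solute: MolGraph, solvent: MolGraph):
    """Join two graphs with one edge between the most oppositely charged atom pair.

    Assumed selection rule: minimise q_solute * q_solvent over all atom pairs.
    """
    pairs = (
        (a, b)
        for a in range(len(solute.charges))
        for b in range(len(solvent.charges))
    )
    i, j = min(pairs, key=lambda p: solute.charges[p[0]] * solvent.charges[p[1]])

    offset = len(solute.elements)  # solvent atom indices are shifted past the solute's
    link = (i, j + offset)         # the added solute-solvent interaction edge
    merged = MolGraph(
        elements=solute.elements + solvent.elements,
        charges=solute.charges + solvent.charges,
        bonds=solute.bonds
        + [(u + offset, v + offset) for u, v in solvent.bonds]
        + [link],
    )
    return merged, link


# Toy inputs: methanol reduced to C, O and the hydroxyl H; water as O, H, H.
methanol = MolGraph(["C", "O", "H"], [0.15, -0.40, 0.21], [(0, 1), (1, 2)])
water = MolGraph(["O", "H", "H"], [-0.41, 0.21, 0.21], [(0, 1), (0, 2)])

merged, link = merge_by_polar_link(methanol, water)
print(link)          # indices of the linked atoms in the merged graph
print(merged.bonds)  # original bonds of both molecules plus the new edge
```

With these placeholder charges the added edge joins the hydroxyl hydrogen of methanol to the oxygen of water. In a fuller pipeline the merged graph, with per-atom features attached to its nodes, would be the input to the graph neural network.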