Abstract

To estimate solubility in water, polyethylene glycol 400 (PEG), and their binary mixtures, quantitative structure–property relationships (QSPRs) were investigated that relate the solubility of a large number of compounds to descriptors of their molecular structures. The relationships were quantified using linear regression analysis (with descriptors selected by stepwise regression) and formal inference-based recursive modeling (FIRM). The models were compared in terms of their solubility prediction accuracy for the validation set. The resulting regression and FIRM models employed a diverse set of molecular descriptors accounting for crystal lattice energy, molecular size, and solute–solvent interactions. The significance of molecular shape for a compound's solubility was evident from the several shape descriptors selected by both FIRM and stepwise regression analysis. Some of these influential structural features, e.g. connectivity indexes and the Balaban topological index, were found to be related to the crystal lattice energy. The results showed that the regression models outperformed most FIRM models in prediction accuracy. However, the most accurate estimation was achieved by combining FIRM and regression models. The results also showed that including melting point in the regression models improves the estimation accuracy, especially for solubility at higher concentrations of PEG. Aqueous or PEG/water solubilities can be estimated by these models with a root mean square error below 0.70.
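
As an illustration of the regression workflow summarized above, the following is a minimal sketch of forward stepwise descriptor selection followed by a root mean square error (RMSE) calculation. It is not the procedure used in the study: the function names (`rmse`, `fit_ols`, `predict_ols`, `forward_stepwise`), the `max_terms` and `min_gain` parameters, and the stopping rule based on RMSE gain are all assumptions made for illustration. Stepwise regression in statistical packages typically uses F-tests or p-values for entering and removing terms instead.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predict from coefficients returned by fit_ols."""
    return coef[0] + X @ coef[1:]


def forward_stepwise(X, y, max_terms=3, min_gain=1e-3):
    """Greedy forward selection of descriptor columns.

    Assumption: a descriptor is added only if it lowers the training RMSE
    by at least `min_gain`. This is a simplified stand-in for the
    significance tests used in conventional stepwise regression.
    """
    selected = []
    # Baseline: predict the mean of y for every compound.
    best_err = rmse(y, np.full(len(y), np.mean(y)))
    while len(selected) < max_terms:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        errs = {}
        for j in candidates:
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            errs[j] = rmse(y, predict_ols(coef, X[:, cols]))
        j_best = min(errs, key=errs.get)
        if best_err - errs[j_best] < min_gain:
            break  # no remaining descriptor improves the fit enough
        selected.append(j_best)
        best_err = errs[j_best]
    return selected
```

In use, `X` would hold one row per compound and one column per molecular descriptor (optionally including melting point), and `y` the measured solubilities. The selected columns would be refit on the training set with `fit_ols`, and `rmse` would then be evaluated on a held-out validation set.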
