Abstract
Quantitative structure–property relationship (QSPR) models were developed using a genetic algorithm (GA)-based variable-selection approach with quantum chemical descriptors derived from AM1 calculations (MOPAC 7.0). With these models, the aqueous solubility of 71 aromatic sulfur-containing carboxylates, including phenylthio and phenylsulfonyl carboxylates, was efficiently estimated and predicted. Using GA-based multivariate linear regression (MLR) with a cross-validation procedure, the most important descriptors were selected from a pool of 28 semi-empirical quantum chemical descriptors, of both steric and electronic types, to build the QSPR models. The molecular descriptors included the molecular surface (S_A), the charges on the carboxyl group (Q_OC), and the magnitude of the difference between E_HOMO of the solute and E_LUMO of water, divided by 100 (E_B), which were the main factors affecting the aqueous solubility of the compounds of interest. The resulting coefficients R and R² of 0.9571 and 0.9161 and the prediction residual error sum of squares (PRESS) of 13.1768 indicated that the model predicts the aqueous solubility of the investigated organic compounds accurately and reliably. When two outliers were omitted from the dataset, the statistics improved to R = 0.9619, R² = 0.9253, and PRESS = 10.3875. Compared with stepwise regression analysis, the results obtained in this work were better and more reasonable, and the best QSPR model was obtained by GA-based MLR. Reasonable mechanisms for the aqueous solubility of the sulfur-containing carboxylates were investigated and interpreted.
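The GA-based MLR procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (71 samples and 28 descriptor columns, chosen only to mirror the sizes quoted in the abstract), leave-one-out is assumed as the cross-validation scheme, and the GA operators (truncation selection, union crossover, single-swap mutation) as well as the names `press_loo`, `ga_select`, and `make_demo_data` are hypothetical choices made for illustration. PRESS is computed from the ordinary least-squares residuals and leverages, which for a linear model is equivalent to refitting with each sample left out.

```python
import numpy as np

def press_loo(X, y):
    """Leave-one-out PRESS for an OLS fit with intercept (assumed CV scheme)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    Q, _ = np.linalg.qr(A)                  # reduced QR; rows of Q give the leverages
    leverage = np.sum(Q ** 2, axis=1)
    # For OLS, the leave-one-out residual equals resid / (1 - leverage).
    return float(np.sum((resid / (1.0 - leverage)) ** 2))

def ga_select(X, y, n_select=3, pop_size=40, n_gen=60, p_mut=0.2, seed=0):
    """Toy GA that searches for the n_select columns of X minimising PRESS."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    cache = {}

    def fitness(subset):
        if subset not in cache:
            cache[subset] = press_loo(X[:, list(subset)], y)
        return cache[subset]

    def random_subset():
        picks = rng.choice(n_desc, size=n_select, replace=False)
        return tuple(sorted(int(k) for k in picks))

    pop = [random_subset() for _ in range(pop_size)]
    for _ in range(n_gen):
        ranked = sorted(set(pop), key=fitness)      # lower PRESS is better
        parents = ranked[: pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a = parents[rng.integers(len(parents))]
            b = parents[rng.integers(len(parents))]
            pool = sorted(set(a) | set(b))          # crossover: draw from the union
            child = {int(k) for k in rng.choice(pool, size=n_select, replace=False)}
            if rng.random() < p_mut:                # mutation: swap one descriptor
                child.remove(int(rng.choice(sorted(child))))
                outside = [k for k in range(n_desc) if k not in child]
                child.add(int(rng.choice(outside)))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = min(set(pop), key=fitness)
    return best, fitness(best)

def make_demo_data(n_samples=71, n_desc=28, seed=0):
    """Synthetic stand-in for a descriptor table; not the paper's data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_desc))
    noise = 0.3 * rng.normal(size=n_samples)
    y = 2.0 * X[:, 0] - 1.5 * X[:, 5] + 1.0 * X[:, 12] + noise
    return X, y

X, y = make_demo_data()
best, press = ga_select(X, y)
print("selected descriptor columns:", best, "PRESS: %.4f" % press)
```

Because the best subsets are carried over unchanged into each new generation, the lowest PRESS in the population never increases. A stepwise regression baseline, as mentioned in the abstract, would instead add or remove one descriptor at a time and can stall in a local optimum that the population-based search may avoid.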