Abstract

In this communication, the quantitative structure–property relationship (QSPR) strategy is applied to estimate the refractive indices of pure organic compounds. To obtain a comprehensive, reliable, and predictive model, a large dataset of 11,918 pure organic compounds was used in its development. Sequential search coupled with the genetic function approximation method was observed to be the only viable technique for selecting the model parameters (molecular descriptors), which are then used to correlate the refractive indices. The K-means clustering technique was applied to allocate the data to the training, validation, and test sets. The leverage approach was used to check the statistical validity of the newly developed model: the hat matrix, the Williams plot, and the model residuals help identify probable outliers in the data. Finally, an elemental analysis was performed, i.e., the validity and accuracy of the model were examined with respect to the atomic elements contained in the molecules. With this strategy, satisfactory results were obtained, as quantified by the following statistical parameters: an average absolute relative deviation of the predicted properties from existing literature values of 0.9%, and a squared correlation coefficient of 0.892.
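
The leverage check and the deviation statistic mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, `X` is assumed to be a design matrix whose first column is an intercept, and the warning leverage h* = 3(p + 1)/n and the ±3 limit on standardized residuals are common Williams-plot conventions assumed here, not values taken from the abstract.

```python
import numpy as np


def leverages(X):
    """Diagonal of the hat matrix H = X (X^T X)^-1 X^T, without forming H."""
    X = np.asarray(X, dtype=float)
    xtx_inv = np.linalg.pinv(X.T @ X)
    # h_i = x_i^T (X^T X)^-1 x_i for every row x_i of X
    return np.einsum("ij,jk,ik->i", X, xtx_inv, X)


def warning_leverage(X):
    """Assumed convention: h* = 3 (p + 1) / n, with p + 1 columns in X."""
    n_samples, n_columns = np.asarray(X).shape
    return 3.0 * n_columns / n_samples


def williams_flags(X, y_true, y_pred, resid_limit=3.0):
    """Flag points outside the assumed Williams-plot limits."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    standardized = residuals / residuals.std(ddof=1)
    high_leverage = leverages(X) > warning_leverage(X)
    large_residual = np.abs(standardized) > resid_limit
    return high_leverage | large_residual


def aard_percent(y_true, y_pred):
    """Average absolute relative deviation, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_pred - y_true) / np.abs(y_true))
```

In a Williams plot, the standardized residuals are drawn against the leverages, and points beyond the vertical line at h* or outside the horizontal residual band are the candidates for probable outliers.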
