Abstract

The aim of the present work is to develop quantitative structure–property relationship (QSPR) models for the adsorption capacity of a large data set of chemicals (n = 3483) onto activated carbon. Two different splitting techniques, k-means clustering and principal component analysis (PCA) combined with the duplex method, were used to divide the data set into training and test sets. An attempt was made to identify the descriptors common to the various models, since their recurrence indicates their importance for adsorption capacity onto activated carbon. Despite the large number of compounds in the training and test sets (3:1 size ratio), we did not omit any compounds showing outlier behavior in order to artificially inflate the validation metrics, thus ensuring the predictive quality of the models for diverse types of compounds. The models were developed to study the predictive ability of extended topochemical atom (ETA) parameters, which are calculated from the two-dimensional representation of molecules and were introduced by the present group of authors. The ETA models were compared with non-ETA models involving topological, spatial and structural descriptors. In all cases, the data set was first subjected to stepwise regression to identify the contributing variables, and the selected variables were then subjected to partial least squares (PLS) regression. The PLS models indicate that ETA descriptors provide better external validation characteristics, in terms of predictive R², than the non-ETA ones. The best ETA model shows encouraging statistical quality (Q²int = 0.8059, Q²ext(F1) = 0.7914, Q²ext(F2) = 0.7909, Q²ext(F3) = 0.8492).
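The three external validation metrics quoted above differ only in the reference variance against which the test-set prediction error is scaled. The abstract does not state the formulas, so the following is a minimal sketch that assumes the definitions commonly used in the QSPR literature: F1 uses the deviation of the test responses from the training mean, F2 uses their deviation from the test mean, and F3 scales the per-compound test error by the per-compound training variance. The function name `external_q2` and the toy numbers are illustrative assumptions, not values or code from the study.

```python
def external_q2(y_train, y_test, y_pred):
    """Return Q2ext(F1), Q2ext(F2) and Q2ext(F3) for a test set.

    Assumed (commonly used) definitions; not taken from the abstract itself.
    y_train: observed responses of the training set
    y_test:  observed responses of the test set
    y_pred:  model predictions for the test set
    """
    n_tr, n_ext = len(y_train), len(y_test)
    mean_tr = sum(y_train) / n_tr
    mean_ext = sum(y_test) / n_ext

    # Sum of squared prediction errors on the test set
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))

    # Reference sums of squares for each metric
    ss_test_about_train_mean = sum((y - mean_tr) ** 2 for y in y_test)
    ss_test_about_test_mean = sum((y - mean_ext) ** 2 for y in y_test)
    ss_train = sum((y - mean_tr) ** 2 for y in y_train)

    return {
        "F1": 1 - press / ss_test_about_train_mean,
        "F2": 1 - press / ss_test_about_test_mean,
        "F3": 1 - (press / n_ext) / (ss_train / n_tr),
    }


# Toy data for illustration only (not from the adsorption data set)
metrics = external_q2(
    y_train=[1.0, 2.0, 3.0, 4.0, 5.0],
    y_test=[2.0, 4.0, 6.0],
    y_pred=[2.5, 3.5, 6.0],
)
print(metrics)
```

Under these definitions F3 depends on the spread of the training responses rather than of the test responses, so it can differ from F1 and F2 even when all three are computed from the same predictions.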
