Abstract
Polydimethylsiloxane-air partition coefficient (KPDMS-air) is a key parameter for passive sampling to measure POPs concentrations. In this study, 13 QSPR models were developed to predict KPDMS-air, with two descriptor selection methods (MLR and RF) and seven algorithms (MLR, LASSO, ANN, SVM, kNN, RF and GBDT). All models were based on a data set of 244 POPs from 13 different categories. The diverse model evaluation parameters calculated from training and test set were used for internal and external verification. Notably, the Radj2, QBOOT2 and Qext2 are 0.995, 0.980 and 0.951 respectively for GBDT model, showing remarkable superiority in fitting, robustness and predictability compared with other models. The discovery that molecular size, branches and types of the bonds were the main internal factors affecting the partition process was revealed by mechanism explanation. Different from the existing QSPR models based on single category compounds, the models developed herein considered multiple classes compounds, so that its application domain was more comprehensive. Therefore, the obtained models can fill the data gap of missing experimental KPDMS-air values for compounds in the application range, and help researchers better understand the distribution behavior of POPs from the perspective of molecular structure.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have