Abstract

Reactions of reactive halogen species (Cl•, Br•, and Cl2•–) with trace organic contaminants (TrOCs) have received much attention in recent years, and the corresponding rate constants (k values) are fundamental parameters for understanding the reaction mechanisms. However, k values are often unknown. In this study, we developed machine learning (ML)-based quantitative structure-activity relationship (QSAR) models to predict k values. We tested five algorithms (random forest, neural network, XGBoost, support vector machine (SVM), and multilinear regression) using molecular descriptors (MDs) and molecular fingerprints (MFs) as inputs. The best-performing models were MD-XGBoost for Cl• and Br• and MF-SVM for Cl2•–, with test-set R² values of 0.876, 0.743, and 0.853, respectively. We found that electron-withdrawing/donating groups tended to influence the reactivity of Cl2•– more than that of Cl• and Br•. This explains why MFs are better inputs for predictive models of Cl2•–, whereas MDs are more suitable for Cl• and Br•. Furthermore, we interpreted the models using SHAP analysis, and the results indicated that our models predicted k values accurately and in a manner consistent with the reaction mechanisms. Our models provide useful tools for estimating unknown k values and help researchers understand the inherent relationships between the models.
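
For readers unfamiliar with the test-set R² metric quoted above, the following is a minimal sketch of how such a value can be computed from held-out data. It assumes the common definition R² = 1 - SS_res/SS_tot and assumes the models are scored on log10(k); the abstract states neither, and the function name and the numbers below are hypothetical illustrations, not data from the study.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot.

    Assumed definition; the study may use a different variant.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out compounds: measured vs. predicted log10(k).
log_k_measured = [9.0, 9.5, 10.0, 8.5]
log_k_predicted = [9.1, 9.4, 9.9, 8.7]

r2_test = r2_score(log_k_measured, log_k_predicted)
print(f"R2 (test set) = {r2_test:.3f}")
```

Under this definition, a value such as 0.876 would mean that the model's squared prediction error on unseen compounds is about 12% of the variance of the measured values.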
