Abstract

The development of accurate and interpretable models for predicting the reaction constants of organic compounds with hydroxyl radicals is vital for advancing quantitative structure-activity relationships (QSAR) in pollutant degradation. Established approaches such as molecular descriptors, molecular fingerprints, and group contribution methods have limitations, because traditional machine learning struggles to capture all intramolecular information simultaneously. To address this, we established an integrated graph neural network (GNN) with approximately 12 million learnable parameters. The GNN represents atoms as nodes and chemical bonds as edges, transforming each molecule into a graph structure that captures microscopic properties while depicting atom connectivity in non-Euclidean space. Using a dataset of 1401 pollutants, we developed the integrated GNN model with Bayesian optimization; the model achieves root mean square errors of 0.165, 0.172, and 0.189 on the training, validation, and test datasets, respectively. Furthermore, we assess molecular structure similarity using molecular fingerprints to enhance the model's applicability. Finally, we propose a gradient weight mapping method for model explainability, which uncovers the key functional groups in chemical reactions from an artificial intelligence perspective and could advance chemistry by drawing on the computational power of artificial intelligence.
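
The graph representation described above (atoms as nodes, chemical bonds as edges) can be illustrated with a minimal sketch. This is not the authors' model: the molecule (ethanol, heavy atoms only), the two-symbol vocabulary, and the helper names `adjacency`, `one_hot`, and `aggregate` are all assumptions chosen for illustration, and the single unweighted neighbour sum stands in for the learned message-passing layers of a real GNN.

```python
# Hypothetical hand-written encoding of ethanol (CH3-CH2-OH), heavy atoms only.
atoms = ["C", "C", "O"]      # node labels, one per atom
bonds = [(0, 1), (1, 2)]     # undirected edges, one per chemical bond


def adjacency(n_atoms, bonds):
    """Build a symmetric adjacency matrix from a list of bonded atom pairs."""
    adj = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def one_hot(symbol, vocabulary):
    """Encode an element symbol as a one-hot node feature vector."""
    return [1.0 if symbol == s else 0.0 for s in vocabulary]


def aggregate(features, adj):
    """One unweighted message-passing step.

    Each atom's new feature vector is its own vector plus the sum of its
    bonded neighbours' vectors. A trained GNN would apply learned weights
    and a nonlinearity here; this sketch omits both.
    """
    n = len(features)
    updated = []
    for i in range(n):
        row = list(features[i])
        for j in range(n):
            if adj[i][j]:
                row = [a + b for a, b in zip(row, features[j])]
        updated.append(row)
    return updated


vocabulary = ["C", "O"]      # assumed element vocabulary for this example
features = [one_hot(a, vocabulary) for a in atoms]
adj = adjacency(len(atoms), bonds)
updated = aggregate(features, adj)
```

After one step, the central carbon's vector reflects both a carbon and an oxygen neighbour, which is how stacked layers let each atom's representation encode its local chemical environment.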
