Emergency water pollution incidents pose a significant risk to the proper functioning of wastewater treatment plants, particularly in integrated domestic-industrial facilities. Source tracing is recognized as an effective way to mitigate ongoing impacts, and machine learning-assisted traceability is emerging as a faster and more efficient alternative to traditional methods. In this study, a total of 712 sets of wastewater characterization data were collected by three-dimensional fluorescence spectroscopy from effluent samples of 14 discharge enterprises across 6 different sectors, as well as from domestic wastewater. After data cleaning and augmentation, a feature fingerprint database of wastewater was constructed to train a traceability model. Several machine learning algorithms, including Back Propagation neural network (BP), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB) and K-Nearest Neighbors (KNN), were selected for constructing the traceability framework. Subsequently, an advanced Particle Swarm Optimization Random Forest model (PSO-RF), capable of automatically optimizing model parameters, was proposed and applied to trace the sources of wastewater in an integrated wastewater treatment plant. The PSO-RF achieved an accuracy of 96.55 % in sector identification and 94.25 % in manufacturer identification. As part of the validation process, laboratory simulations were conducted using blended wastewater with different volume ratios of domestic and industrial wastewater to evaluate the potential application of PSO-RF. The results consistently demonstrated the effectiveness of PSO-RF, particularly in tracing pharmaceutical wastewater sources, where it maintained an accuracy of over 85 %. This work presents a novel strategy for tracing abnormal sources during emergency pollution incidents, providing essential support for integrating artificial intelligence (AI) into meticulous wastewater management.
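The abstract does not specify how PSO-RF is implemented, so the following is only a minimal sketch of the general idea: a basic particle swarm searching a bounded hyperparameter space for the lowest error. Everything in it is an assumption made for illustration, including the function names `pso_minimize` and `stand_in_error`, the swarm settings (inertia and acceleration coefficients), the two tuned parameters (tree count and maximum depth) and their bounds. In particular, `stand_in_error` is a hypothetical smooth surrogate; in an actual PSO-RF pipeline it would be replaced by something like one minus the cross-validated accuracy of a random forest trained on the fluorescence fingerprint features, with the continuous positions rounded to integers.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO.

    All default settings are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_error(params):
    """Hypothetical surrogate for (1 - cross-validated RF accuracy)."""
    n_trees, max_depth = params
    return ((n_trees - 200.0) / 500.0) ** 2 + ((max_depth - 12.0) / 30.0) ** 2


# Assumed search space: number of trees in [10, 500], max depth in [2, 30].
best, err = pso_minimize(stand_in_error, bounds=[(10, 500), (2, 30)])
print("best parameters:", best, "error:", err)
```

Swapping the surrogate for a real cross-validation score makes each objective call much more expensive, so the swarm size and iteration count would need to be chosen with the training cost in mind.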