The urgent need to eliminate perfluorooctanoic acid (PFOA) has positioned electrooxidation (EO) as a key solution for pollutant degradation. This study evaluates several machine learning (ML) models, namely K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Gradient Boosted Decision Trees (GBDT), and Deep Learning (DL), for predicting EO efficiency in PFOA removal. Under 10-fold cross-validation, the RF model outperformed the others, with a root mean square error (RMSE) of 7.7 and a correlation coefficient of 0.965, indicating robust and accurate predictions across diverse operational settings. Feature importance within the RF model was analyzed using Gini impurity and Mean Decrease in Accuracy (MDA). Electrolysis Time emerged as the most influential factor in both analyses, consistent with its role in extending the exposure of PFOA molecules to reactive species at the electrode surfaces. Gini and MDA also agreed strongly in identifying Current Density and Anode Material as critical factors, although MDA placed slightly more emphasis on Anode Material. The two measures differed more in ranking Electrolyte Type and Electrolyte Concentration, with MDA assigning higher importance to Electrolyte Concentration. The Water Matrix was consistently ranked as the least important factor. The overall concordance between Gini and MDA supports the reliability of the RF model in identifying key drivers of electrochemical degradation. This work thus provides a reliable ML-based tool for predicting EO performance in environmental remediation efforts.
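The evaluation workflow summarized above (10-fold cross-validation scored by RMSE and correlation coefficient, followed by a permutation-style importance in the spirit of MDA) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the study's dataset, the three feature names and the toy response are hypothetical, and a simple KNN regressor stands in for the RF model so that the example needs only the standard library. Importance is measured here as the increase in cross-validated RMSE when one feature is shuffled in the held-out fold.

```python
import math
import random

def make_synthetic_data(n=200, seed=0):
    """Hypothetical toy data in [0, 1]; NOT the study's dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        time, current, matrix = rng.random(), rng.random(), rng.random()
        # Assumed response: time dominates, current matters less, matrix is ignored.
        y.append(70 * time + 25 * current + rng.gauss(0, 2))
        X.append([time, current, matrix])
    return X, y

def knn_predict(X_train, y_train, x, k=5):
    """Mean target of the k nearest training rows (stand-in for the RF model)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return sum(t for _, t in nearest) / len(nearest)

def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def k_fold_indices(n, k=10, seed=0):
    """Shuffle row indices once and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_predict(X, y, k=10, permute_col=None, seed=0):
    """Out-of-fold predictions; optionally shuffle one column of each held-out fold."""
    rng = random.Random(seed)
    preds = [0.0] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held = set(fold)
        X_tr = [X[i] for i in range(len(y)) if i not in held]
        y_tr = [y[i] for i in range(len(y)) if i not in held]
        X_te = [list(X[i]) for i in fold]  # copies, so X itself is never modified
        if permute_col is not None:
            col = [row[permute_col] for row in X_te]
            rng.shuffle(col)
            for row, value in zip(X_te, col):
                row[permute_col] = value
        for i, row in zip(fold, X_te):
            preds[i] = knn_predict(X_tr, y_tr, row)
    return preds

X, y = make_synthetic_data()
baseline = cross_val_predict(X, y)
base_rmse, base_r = rmse(y, baseline), pearson_r(y, baseline)

# Hypothetical feature names, chosen only to mirror the abstract's vocabulary.
names = ["electrolysis_time", "current_density", "water_matrix"]
importance = {
    name: rmse(y, cross_val_predict(X, y, permute_col=j)) - base_rmse
    for j, name in enumerate(names)
}

print(f"10-fold RMSE = {base_rmse:.2f}, r = {base_r:.3f}")
for name, gain in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: RMSE increase when permuted = {gain:.2f}")
```

Because the same folds are reused for the baseline and for each permuted run, the RMSE differences are paired comparisons. An impurity-based (Gini) ranking has no analogue for a KNN model, so comparing the two importance measures as the study does would require an actual tree ensemble.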