Electrokinetic remediation (EKR) is a promising approach for remediating polluted soil, using electric fields to mobilize contaminants and facilitate their removal. Accurate efficiency prediction is crucial for optimizing EKR processes, reducing costs, and minimizing environmental impact. However, EKR efficiency is influenced by numerous interacting factors, which makes outcomes difficult to predict reliably. This study introduces a novel approach for developing interpretable machine learning (ML) models for efficient and unbiased prediction of EKR efficiency from material properties and experimental conditions. A large experimental dataset comprising 185 tests was compiled, covering various EKR parameters, heavy metal concentrations, and soil characteristics. A structured ML workflow was devised that evaluates diverse ML algorithms and applies model-agnostic interpretation methods to uncover the relationships between removal efficiency and the relevant input attributes. Among the investigated ML models, Extreme Gradient Boosting (XGB) performed best, and Bayesian optimization was applied to tune it further. The optimized XGB model achieved an RMSE of 0.255, an MAE of 0.158, and an R² of 0.82 on the test set. SHapley Additive exPlanations (SHAP) analysis was then used to interpret the contributions of the input variables. Electrode distance emerged as the most significant factor influencing EKR efficiency, followed by area, electrolyte species, and remediation duration. This research advances the prediction of EKR efficiency and provides insight into the factors that drive remediation success, paving the way for more efficient and scalable applications in environmental management.
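For readers unfamiliar with the three test-set metrics quoted in the abstract, the following minimal sketch shows how RMSE, MAE, and R² are conventionally computed from observed and predicted values. The function name `regression_metrics` and the toy numbers are illustrative assumptions; they are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares and total sum of squares around the mean.
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")

    rmse = math.sqrt(ss_res / n)           # root mean squared error
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    r2 = 1.0 - ss_res / ss_tot             # coefficient of determination
    return rmse, mae, r2


# Toy values, purely illustrative (not EKR measurements).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
```

RMSE penalizes large errors more heavily than MAE, so an RMSE noticeably above the MAE suggests that a few predictions deviate much more than the rest.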