Assessing the Suitability of Boosting Machine-Learning Algorithms for Classifying Arsenic-Contaminated Waters: A Novel Model-Explainable Approach Using SHapley Additive exPlanations

Bemah Ibrahim, Anthony Ewusi, Isaac Ahenkorah

doi:10.3390/w14213509

Open Access

https://doi.org/10.3390/w14213509

Abstract

There is growing tension between high-performance machine-learning (ML) models and explainability within the scientific community. In arsenic modelling, understanding why an ML model makes a certain prediction, for instance “high arsenic” instead of “low arsenic”, is as important as the prediction accuracy. In response, this study aims to explain model predictions by assessing the relationship between the influencing input variables, i.e., pH, turbidity (Turb), total dissolved solids (TDS), and electrical conductivity (Cond), and arsenic mobility. The two main objectives of this study are to: (i) classify arsenic concentrations in multiple water sources using novel boosting algorithms such as natural gradient boosting (NGB), categorical boosting (CATB), and adaptive boosting (ADAB), and compare them with other existing representative boosting algorithms, and (ii) introduce a novel SHapley Additive exPlanations (SHAP) approach for interpreting the performance of ML models. The results indicate that the newly introduced boosting algorithms performed efficiently, with results comparable to those of the state-of-the-art boosting algorithms and a benchmark random forest model. Interestingly, extreme gradient boosting (XGB) proved superior to the remaining models in terms of overall and single-class performance metrics. Global and local interpretation (using SHAP with XGB) revealed that high-pH water is highly correlated with high-arsenic water and vice versa. In general, high pH, high Cond, and high TDS were found to be potential indicators of high-arsenic water sources. Conversely, low pH, low Cond, and low TDS were the main indicators of low-arsenic water sources. This study provides new insights into the use of ML and explainable methods for arsenic modelling.
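The idea behind SHAP attributions can be illustrated with a minimal sketch that computes exact Shapley values for one water sample by enumerating every coalition of the four input variables. This is not the authors' XGB model or the `shap` library: the scoring function `toy_score`, its coefficients, and the sample and baseline values are all invented for illustration. Only the feature names (pH, Turb, TDS, Cond) come from the abstract.

```python
from itertools import combinations
from math import factorial

FEATURES = ["pH", "Turb", "TDS", "Cond"]


def toy_score(x):
    """Hypothetical 'high arsenic' score. Coefficients are invented, not fitted."""
    return (
        0.8 * (x["pH"] - 7.0)
        + 0.01 * x["Turb"]
        + 0.002 * x["TDS"]
        + 0.001 * x["Cond"]
        + 0.5 * (x["pH"] - 7.0) * (x["TDS"] / 1000.0)  # small pH-TDS interaction
    )


def shapley_values(model, sample, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    n = len(FEATURES)

    def value(coalition):
        mixed = {f: (sample[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = set(subset)
                total += weight * (value(without_f | {f}) - value(without_f))
        phi[f] = total
    return phi


# Assumed values for one sample and for a reference ("average") water source.
sample = {"pH": 8.2, "Turb": 5.0, "TDS": 900.0, "Cond": 1400.0}
baseline = {"pH": 7.0, "Turb": 2.0, "TDS": 400.0, "Cond": 700.0}

phi = shapley_values(toy_score, sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the score for the sample and the score for the baseline, which is the additive property that SHAP relies on for local interpretation. Exact enumeration is only practical for a handful of features; for tree ensembles, dedicated algorithms are normally used instead.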

Journal: Water	Publication Date: Nov 2, 2022
Citations: 7	License type: CC BY 4.0
