Groundwater arsenic (As) contamination is a significant issue worldwide, including in China and Pakistan, particularly in canal command areas. In this study, 131 groundwater samples were collected, and three machine learning models [Random Forest (RF), Logistic Regression (LR), and Artificial Neural Network (ANN)] were employed to predict As concentration. Descriptive statistics showed that the samples were within the WHO permissible limits for pH, Ca, Mg, turbidity, Cl, K, Na, SO4, NO3, and F, but beyond the WHO limits for EC, HCO3, TDS, and As. RF ranked the predictors by the median decrease in Gini node impurity across all tree splits. TDS, EC, HCO3- and turbidity appeared at the upper end of the resulting importance graph, indicating that these factors were the most significant in predicting As contamination of the samples. Moreover, these factors were found to be positively correlated with As contamination. The LR model showed a good fit to the data. The ANN classified the data set into two classes: (1) inside the WHO limit and (2) outside the WHO limit. Total dissolved solids (TDS), turbidity, sodium (Na) and electrical conductivity (EC) were positively correlated with As concentration in the collected samples, whereas pH and K were negatively associated with it. Confusion matrices and ROC-AUC scores showed that the RF model outperformed the LR and ANN models in accuracy and sensitivity. The key variables influencing As concentration in the groundwater resources of the study area were identified as TDS, chloride (Cl), bicarbonate (HCO3-) and turbidity. The study provides a complete profile of the 131 water samples, which can be used to develop strategies for minimizing groundwater contamination in the Rohri canal command area. Moreover, steps can be taken to keep the discussed parameters inside the WHO limits.
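As an illustration of two quantities mentioned above, the following minimal Python sketch computes the decrease in Gini impurity produced by a single tree split, and the accuracy and sensitivity derived from predicted versus observed classes. The function names and the toy labels (1 = As outside the WHO limit, 0 = inside) are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum_k p_k^2 over the class shares p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity drop when the parent node is split into left and right children."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Accuracy and sensitivity (true positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Hypothetical toy data, for illustration only.
parent = [1, 1, 1, 0, 0, 0]          # mixed node before the split
left, right = [1, 1, 1], [0, 0, 0]   # a split that separates the classes fully
split_gain = gini_decrease(parent, left, right)

y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(split_gain, acc, sens)
```

In a random forest, a variable's importance is typically obtained by aggregating such impurity decreases over every split that uses that variable, across all trees.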