Assessing groundwater quality typically involves labor-intensive, time-consuming, and costly laboratory tests, making real-time monitoring impractical, especially at the local level. Groundwater quality projections at the local scale based on broad spatial datasets have been inaccurate because of variations in hydrogeology, human activities, industrial operations, groundwater extraction, and waste disposal. This study aims to identify the most dependable and resilient machine learning algorithms for forecasting groundwater quality at nearby monitoring locations using simple water quality metrics that can be assessed quickly, without extensive sampling and laboratory testing. The Entropy-weighted Water Quality Index (EWQI) was calculated from a large spatial and temporal dataset (2014–2021) of 977 wells, with parameters including pH, total hardness (TH), calcium (Ca²⁺), magnesium (Mg²⁺), sodium (Na⁺), potassium (K⁺), sulfate (SO₄²⁻), chloride (Cl⁻), nitrate (NO₃⁻), total dissolved solids (TDS), and fluoride (F⁻). The same parameters were also measured in 33 open wells at the three local monitoring sites from December 2022 to March 2023. The EWQI was predicted using Random Forest (RF), eXtreme Gradient Boosting (XGB), and Deep Neural Network (DNN) models. The pH, TH, and TDS were used as input variables for EWQI predictions, as they can be easily measured using handheld probes or multi-parameter meters. Model performance was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). During the training stage, all three models predicted the EWQI with an R² greater than 0.90 and minimal errors when pH, TH, and TDS were the input variables. To validate the models at a local scale, the EWQI was predicted at the village level (e.g., Antoli, Balapura, and Lapodiaya) using pH, TH, and TDS as input variables. The predicted EWQI closely matched the actual EWQI, with an R² greater than 0.90.
The results also show that the models can predict the EWQI from basic, easily measured parameters, providing an overall picture of water quality for a small area. Hence, these machine learning models could be useful for representing groundwater quality accurately while reducing reliance on time-consuming and costly laboratory techniques.
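The three evaluation metrics named above (R², MAE, and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample EWQI values are hypothetical, and the formulas are the standard textbook definitions, assumed here to match what the study used.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, MAE, and RMSE for paired observed/predicted values.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean square error: penalizes large residuals more heavily.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - (residual SS / total SS).
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse}


if __name__ == "__main__":
    # Made-up EWQI values, for demonstration only (not data from the study).
    actual_ewqi = [42.0, 55.5, 61.2, 78.9, 103.4]
    predicted_ewqi = [44.1, 53.8, 63.0, 76.5, 99.7]
    print(regression_metrics(actual_ewqi, predicted_ewqi))
```

On this scale, an R² reported as "greater than 90%" corresponds to a returned `r2` above 0.90. MAE and RMSE are in the same units as the EWQI itself.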