Groundwater quality is assessed through water sampling and laboratory analysis, but these field-based measurements are costly and time-consuming. This study introduces a machine learning (ML)-based framework, built on an innovative application of a stacking ensemble learning model, for predicting groundwater quality in an unconfined aquifer located in northern Iran. The groundwater quality index (GWQI) from 250 wells was evaluated and classified. We considered various influential factors: proximity to residential areas, evaporation, aquifer transmissivity, precipitation, population density, distance to industrial centers, distance to water resources, and topography. Three ML classifiers were employed to establish relationships between GWQI and these factors: the AdaBoost classifier (ADA), quadratic discriminant analysis (QDA), and stacking ensemble learning (SEL). A novel model, dubbed quadratic-ada-stacking ensemble learning (QA-SEL), was introduced to predict GWQI. The performance of these algorithms was evaluated using the receiver operating characteristic (ROC) and multiple statistical efficiency indicators, including overall accuracy, precision, recall, and the F1 score. All three ML algorithms predicted GWQI with a high degree of accuracy. Nonetheless, the QA-SEL method was identified as the most effective model owing to its superior performance (overall accuracy = 0.95, precision = 0.95, recall = 0.96, ROC = 0.96). Following model optimization and testing, the QA-SEL model and a GIS were employed to map GWQI classes across the entire study area. The resulting GWQI map was validated by comparing measured and predicted GWQI values. This study offers an economically efficient model for groundwater quality prediction that can be replicated in other plains.
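
As a rough illustration of the stacking idea described above, the following minimal sketch combines an AdaBoost classifier and quadratic discriminant analysis as base learners under a meta-learner using scikit-learn. It is not the authors' implementation: the data are synthetic (250 samples and 8 predictors chosen only to mirror the number of wells and factors), the two-class target, the logistic-regression meta-learner, the 5-fold internal cross-validation, and the 70/30 split are all assumptions, and the variable name `qa_sel` is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the well data: 250 samples, 8 predictors, 2 classes.
# (Assumption: real GWQI classes and factor values are not reproduced here.)
X, y = make_classification(n_samples=250, n_features=8, n_informative=5,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Base learners: AdaBoost (ADA) and quadratic discriminant analysis (QDA).
base_learners = [
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("qda", QuadraticDiscriminantAnalysis()),
]

# Stacking: the meta-learner is fit on cross-validated base-learner outputs.
# (Assumption: logistic regression meta-learner, 5-fold internal CV.)
qa_sel = StackingClassifier(estimators=base_learners,
                            final_estimator=LogisticRegression(),
                            cv=5)
qa_sel.fit(X_train, y_train)

# Hard class predictions and positive-class probabilities on held-out data.
y_pred = qa_sel.predict(X_test)
y_score = qa_sel.predict_proba(X_test)[:, 1]

# The efficiency indicators named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The scores printed for this synthetic data have no bearing on the values reported in the study; the sketch only shows how the base learners, the meta-learner, and the evaluation metrics fit together.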