The growing demand for groundwater resources warrants the demarcation of groundwater spring potential zones (GSPZ) to support sustainable strategies for groundwater identification, conservation, and management. Here we used three novel data mining (DM) ensembles to generate groundwater spring potential maps (GSPMs): RF-BRT (random forest-boosted regression tree), MARS-SVM (multivariate adaptive regression spline-support vector machine), and FDA-GLM-MDA (functional data analysis-generalized linear model-mixture discriminant analysis). A total of 1726 groundwater spring locations were collected from the regional water company of Tehran Province and from field investigation; 1208 springs (70%) were used for training and the remaining 518 (30%) for validation. Twelve conditioning factors were used in the mapping process and assessed for their importance in predicting groundwater spring potential: DEM (digital elevation model)/elevation, fault density, aspect, rainfall, distance from rivers, distance from faults, slope, MRVBF (multiresolution index of valley bottom flatness), TWI (topographic wetness index), lithology, land use/land cover, and permeability. The variable importance (VI) analysis using SVM (support vector machine) reveals that the most significant conditioning factors in the prediction process are rainfall, TWI, DEM-elevation, distance from rivers, slope, distance from faults, and MRVBF. The GSPMs generated from the DM ensembles were validated using cut-off-dependent measures (recall, fallout, F-measure, accuracy, precision, specificity, TSS: true skill statistic, Cohen’s kappa, fourfold plot, CCI: correctly classified instances) and a cut-off-independent measure (ROC-AUC: receiver operating characteristic-area under the curve). 
The validation measures show that RF-BRT achieves the best values of recall, F-measure, overall accuracy, precision, specificity, TSS, Cohen’s kappa, fourfold plot, and CCI, followed by MARS-SVM and FDA-GLM-MDA. The AUC values give the same ranking: RF-BRT (0.955), MARS-SVM (0.934), and FDA-GLM-MDA (0.914). The GSPMs generated using the novel DM ensemble models in our study can be used by policymakers to implement strategies for effective land use planning and sustainable groundwater management.
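
For readers unfamiliar with the threshold-based validation measures named in the abstract, the sketch below shows how they are conventionally derived from a binary confusion matrix (spring present vs. absent). It is an illustration only, not the authors' implementation: the `ConfusionCounts` class, the `cutoff_metrics` function, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for a binary confusion matrix."""
    tp: int  # springs correctly predicted as springs
    fp: int  # non-springs predicted as springs
    tn: int  # non-springs correctly predicted as non-springs
    fn: int  # springs predicted as non-springs


def cutoff_metrics(c: ConfusionCounts) -> dict:
    """Standard textbook definitions of threshold-based measures."""
    n = c.tp + c.fp + c.tn + c.fn
    recall = c.tp / (c.tp + c.fn)          # sensitivity, true positive rate
    specificity = c.tn / (c.tn + c.fp)     # true negative rate
    fallout = c.fp / (c.fp + c.tn)         # false positive rate = 1 - specificity
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / n
    f_measure = 2 * precision * recall / (precision + recall)
    tss = recall + specificity - 1         # true skill statistic
    # Agreement expected by chance, from the row and column marginals.
    p_chance = ((c.tp + c.fp) * (c.tp + c.fn)
                + (c.tn + c.fn) * (c.tn + c.fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)  # Cohen's kappa
    return {
        "recall": recall, "specificity": specificity, "fallout": fallout,
        "precision": precision, "accuracy": accuracy,
        "f_measure": f_measure, "tss": tss, "kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up counts for illustration; not results from the study.
    example = ConfusionCounts(tp=90, fp=10, tn=80, fn=20)
    for name, value in cutoff_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

All of these measures depend on the probability threshold used to turn a potential map into a present/absent prediction, which is why the abstract groups them separately from ROC-AUC, which summarizes performance across all thresholds.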