Abstract
This study introduces a methodology for modeling the groundwater artesian condition (AC) in an arid region of southern Iraq using five machine learning (ML) techniques: stochastic gradient boosting (SGB), classification and regression trees (CART), random forest (RF), support vector machine (SVM), and k-nearest neighbor (kNN). To this end, an inventory map of flowing and non-flowing groundwater wells was used along with six explanatory factors: distance to faults (FDIS), fault density (FDEN), lineament density (LDEN), aquifer saturated thickness (ST), well depth (WD), and ground surface elevation. Examining the spatial pattern of the flowing wells with the average nearest neighbor approach showed that these wells are clustered (average nearest neighbor ratio = 0.48). Assessing factor importance with the Boruta package indicated that all six factors play a role in controlling the emergence of AC. Five statistical performance measures were used to evaluate the predictive capability of the models: accuracy, kappa, the receiver operating characteristic (ROC) curve, sensitivity, and specificity. Applying the ML models in the R statistical environment with the caret package showed that the best-performing model was RF, followed by SGB and kNN; CART and SVM performed worst. Accordingly, RF was used as the reference model for probability mapping of AC in the study area. The resulting map indicates that 29% (618 km²) of the study area falls within the predicted high-AC zone, which is distributed mainly in the eastern part, with a small spot in the north. The AC probability map developed in this study could be used to drill successful flowing artesian wells with minimal effort and cost and thus to manage this important system efficiently.
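The clustering result rests on the average nearest neighbor ratio, so a minimal sketch of how such a ratio is commonly computed may help. It assumes the standard Clark-Evans form: the observed mean nearest neighbor distance divided by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A). The function name, the toy coordinates, and the area value are illustrative assumptions, not the study's data or code, and no edge correction is applied.

```python
import math


def average_nearest_neighbor_ratio(points, area):
    """Return observed / expected mean nearest-neighbor distance.

    points: list of (x, y) coordinates in a projected (planar) system.
    area:   study-area size in the squared units of the coordinates.
    A ratio below 1 suggests clustering; above 1 suggests dispersion.
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")

    # Observed mean distance from each point to its closest neighbor.
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n

    # Expected mean distance for a random pattern with density n / area.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


# Hypothetical toy patterns inside an assumed 10 x 10 study area.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
dispersed = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]

clustered_ratio = average_nearest_neighbor_ratio(clustered, area=100.0)
dispersed_ratio = average_nearest_neighbor_ratio(dispersed, area=100.0)
```

On this reading, the reported ratio of 0.48 means that flowing wells lie, on average, about half as far from their nearest neighbor as a random pattern of the same density would imply.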