The Rmel aquifer, located in the Tangier-Tetouan-Al Hoceima region of northwest Morocco, covers approximately 240 km² and faces increasing pollution threats due to population growth and economic development. This study assesses the aquifer's vulnerability to pollution and compares the performance of several machine learning models integrated with the DRASTIC-LU/LC method. The research used a dataset of 52 water samples analyzed for nitrate concentrations and considered eight factors influencing vulnerability: groundwater depth, net recharge, aquifer lithology, soil texture, topography, vadose zone impact, hydraulic conductivity, and land use. An information gain test was applied to evaluate the importance of these factors. Four machine learning algorithms were combined with the DRASTIC-LU/LC method: multilayer perceptron (MLP), the bagging algorithm (BA), K-nearest neighbors (KNN), and extremely randomized trees (ERT). Model performance was assessed using the area under the receiver operating characteristic curve (ROC-AUC). The ERT model combined with DRASTIC-LU/LC achieved the highest AUC (0.929), followed by BA (0.925), MLP (0.852), and KNN (0.787). In comparison, the original DRASTIC-LU/LC model had an AUC of 0.530. The results show that vulnerability varies significantly across the Rmel aquifer, with high to very high levels in the southern and northwestern regions and moderate to low levels in the northeastern and central areas. The vulnerability maps were validated against the nitrate concentrations observed in the water samples, which confirmed model accuracy. Groundwater depth, net recharge, and hydraulic conductivity were identified as the most significant factors influencing vulnerability. This study demonstrates the effectiveness of integrating machine learning models with the DRASTIC-LU/LC method for accurate aquifer vulnerability assessment, offering valuable tools for public policy and groundwater management.
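To make the baseline concrete, the following is a minimal sketch of how a DRASTIC-LU/LC index can be computed as a weighted sum of factor ratings and then scored against nitrate observations with ROC-AUC. It is not the study's pipeline. The weights for the seven DRASTIC factors are the conventional ones; the land-use weight of 5, the factor ratings, and the binary "polluted" labels are illustrative assumptions, since the abstract reports neither the weights nor the nitrate threshold used.

```python
# Conventional DRASTIC weights; the land-use (LU) weight of 5 is an
# assumption often seen in DRASTIC-LU variants, not a value from this study.
WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3, "LU": 5}


def drastic_lu_index(ratings):
    """Weighted sum of the eight factor ratings (each typically 1 to 10)."""
    return sum(WEIGHTS[k] * ratings[k] for k in WEIGHTS)


def roc_auc(scores, labels):
    """AUC as the probability that a random positive sample outscores a
    random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical ratings at three sampling points (illustrative values only).
samples = [
    {"D": 9, "R": 8, "A": 6, "S": 7, "T": 10, "I": 8, "C": 6, "LU": 8},
    {"D": 5, "R": 6, "A": 6, "S": 5, "T": 9, "I": 6, "C": 4, "LU": 5},
    {"D": 2, "R": 3, "A": 4, "S": 3, "T": 5, "I": 3, "C": 2, "LU": 2},
]
# Hypothetical labels: 1 if measured nitrate exceeds a chosen threshold.
polluted = [1, 1, 0]

indices = [drastic_lu_index(s) for s in samples]
auc = roc_auc(indices, polluted)
print(indices, auc)
```

In a machine-learning variant such as those compared here, the fixed weighted sum would be replaced by a classifier trained on the same eight factors, and its predicted probabilities would be passed to the same AUC calculation.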