Abstract

Most machine learning methods for groundwater vulnerability assessment treat the problem as binary classification. This study develops multi-class classification models to map groundwater vulnerability to nitrate contamination, and examines how the number of classes affects the results by comparing three-class and five-class schemes. Three machine learning models, Random Forest, Extreme Gradient Boosting, and CART (Classification and Regression Trees), were each developed under both classification schemes, giving six models in total. The inputs are the parameters of the conventional DRASTIC method plus one additional parameter, land use. Accuracy, Kappa, Positive Predictive Value, Negative Predictive Value, and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) were compared across all six models to select the optimal one. Based on these metrics and on a consistent distribution of area among the classes, the Random Forest model with three classes, which achieved an AUC of 0.95, is considered optimal for this objective. The study highlights the importance of the data classification process and of the choice of the number of classes for ML-based prediction of groundwater vulnerability. By combining Geographic Information Systems with advanced machine learning techniques, the proposed approach offers insights for improved groundwater management and contamination mitigation strategies.
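
To make the role of the class count concrete, the following is a minimal sketch of how the same continuous target can be binned into three or five classes, and how Accuracy and Cohen's Kappa are computed for multi-class labels. It is an illustration only: the abstract does not state how class boundaries were defined, so the equal-frequency binning here is an assumption, and the nitrate values and the function names (`quantile_classes`, `accuracy`, `cohen_kappa`) are hypothetical.

```python
from collections import Counter


def quantile_classes(values, n_classes):
    """Label each value 0..n_classes-1 by equal-frequency binning.

    Assumption: the study's actual class thresholds are not given,
    so equal-frequency bins are used purely for illustration.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_classes // len(values)
    return labels


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1:
        return 1.0  # only one class present in both label sets
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical nitrate concentrations (mg/L), not data from the study.
nitrate = [2, 5, 8, 12, 20, 30, 45, 50, 60, 80, 95, 120]
three_class = quantile_classes(nitrate, 3)
five_class = quantile_classes(nitrate, 5)
```

With more classes, each class holds fewer samples and neighbouring classes are harder to separate, which is one reason the number of classes can change both the metrics and the mapped area per class.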
