Groundwater quality assessment using machine learning models: a comprehensive study on the industrial corridor of a semi-arid region.

Loganathan Krishnamoorthy, Vignesh Rajkumar Lakshmanan

doi:10.1007/s11356-024-34119-7

Abstract

Water plays a vital role in sustaining humans and other living organisms. Groundwater quality analysis has become indispensable because of the increasing contamination of water resources and global warming. This study used machine learning (ML) models to predict the water quality index (WQI) and water quality classification (WQC). Forty groundwater samples were collected near the Ranipet industrial corridor, and their hydrogeochemistry and heavy metal contamination were analyzed. WQC prediction employed random forest (RF), gradient boosting (GB), decision tree (DT), and K-nearest neighbor (KNN) models, and WQI prediction used extreme gradient boosting (XGBoost), support vector regressor (SVR), RF, and multi-layer perceptron (MLP) models. The grid search method was used to evaluate the ML models by F1 score, accuracy, recall, precision, and Matthews correlation coefficient (MCC) for WQC, and by the coefficient of determination (R²), mean absolute error (MAE), mean square error (MSE), and median absolute percentage error (MAPE) for WQI. The WQI results indicate that the groundwater quality of the study area is very poor and unsuitable for drinking or irrigation purposes. The RF model performed best in predicting both WQC (accuracy = 97%) and WQI (R² = 91.0%), outperforming the other models. The findings suggest that ML models perform well and yield better accuracy than the conventional techniques used in groundwater quality assessment studies.
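The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `regression_metrics` and `binary_classification_metrics` are hypothetical, the sample values are invented purely for illustration and are not data from the study, and the classification case is reduced to two classes for brevity, whereas a WQC scheme would typically have several classes.

```python
from math import sqrt
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MSE and median absolute percentage error (WQI-style targets)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": mean([abs(e) for e in errors]),
        "mse": ss_res / len(errors),
        # Assumes no true value is zero, otherwise the ratio is undefined.
        "median_ape": median([abs(e) / abs(t) for e, t in zip(errors, y_true)]),
    }


def binary_classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, F1 and MCC for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # MCC uses all four confusion-matrix cells; defined as 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Invented example values, not measurements from the study.
wqi_true = [100, 200, 300, 400]
wqi_pred = [110, 190, 300, 420]
wqc_true = [1, 1, 1, 0, 0, 0, 1, 0]
wqc_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print(regression_metrics(wqi_true, wqi_pred))
print(binary_classification_metrics(wqc_true, wqc_pred))
```

With only forty samples, a score such as MCC is a useful complement to accuracy, since it accounts for all four cells of the confusion matrix and is less flattering when classes are imbalanced.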

Full Text