Groundwater is the most prominent source of drinking water, followed by lakes and reservoirs. Hydrological parameters such as temperature, dissolved oxygen, pH, conductivity, oxidation-reduction potential (ORP), and turbidity often change when waste is dumped into natural drinking water sources, particularly in densely populated areas. Water quality must therefore be tested before public consumption. In this research, water samples were collected from 129 wells in the Kanchipuram district of Tamil Nadu, India. An integrated machine-learning-based prediction model is proposed to determine the groundwater quality index (GQI). Several machine learning models were used to classify water quality, including the naïve Bayes, k-nearest neighbors (KNN), and XGBoost classifiers. Water quality predictions for 2024 were made by combining these classification algorithms with models based on long short-term memory (LSTM) neural networks. The projected water quality characteristics were analyzed using geographical information system (GIS) technology to better understand and visualize the results. The XGBoost classifier achieved an accuracy of roughly 94.6%, outperforming prior findings in the literature. The classification and prediction model was validated against current data samples collected and tested from a selected well. The predictions were accurate to within a 5% error range, supporting sustainable water management.
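Of the classifiers named above, KNN is the simplest to illustrate. The following is a minimal sketch under stated assumptions, not the study's implementation: the readings are made up, and the two class labels, the min-max scaling, the Euclidean distance, and the choice of k = 3 are all hypothetical choices made only to show how a well sample described by the six measured parameters could be assigned a quality class.

```python
import math
from collections import Counter

# Made-up readings for illustration only (NOT the study's data).
# Feature order: temperature (deg C), dissolved oxygen (mg/L), pH,
# conductivity (uS/cm), ORP (mV), turbidity (NTU).
# The labels "good" and "poor" are hypothetical quality classes.
TRAIN = [
    ((27.0, 7.5, 7.2, 450.0, 250.0, 1.0), "good"),
    ((28.0, 7.0, 7.4, 500.0, 240.0, 2.0), "good"),
    ((27.5, 6.8, 7.0, 520.0, 230.0, 1.5), "good"),
    ((30.0, 3.0, 8.9, 1800.0, 90.0, 12.0), "poor"),
    ((31.0, 2.5, 9.1, 2100.0, 80.0, 15.0), "poor"),
    ((29.5, 3.5, 6.0, 1900.0, 100.0, 10.0), "poor"),
]


def fit_scaler(rows):
    """Return per-feature minimum and range for min-max scaling."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Fall back to 1.0 for a constant feature to avoid division by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return lows, spans


def scale(row, lows, spans):
    """Scale one sample using the training minimum and range."""
    return [(value - low) / span for value, low, span in zip(row, lows, spans)]


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    lows, spans = fit_scaler([features for features, _ in train])
    scaled_query = scale(query, lows, spans)
    nearest = sorted(
        train,
        key=lambda item: math.dist(scale(item[0], lows, spans), scaled_query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    new_well = (27.2, 7.2, 7.1, 480.0, 245.0, 1.2)  # hypothetical new sample
    print(knn_predict(TRAIN, new_well))
```

Scaling matters here because conductivity spans thousands of units while pH spans only a few; without it, the distance would be dominated by conductivity alone.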