Abstract

In developing countries, a multitude of problems affecting the water supply process can result in contamination at water taps. While machine learning has become popular for efficient water quality prediction, acquiring the data needed for modelling in developing countries is challenging. This study builds machine learning models for water quality prediction, using a pseudo-pipeline network to compensate for missing data on the water supply process. Using both water source and water tap quality information measured by the Government of Nepal, we apply three machine learning models: support vector machine (SVM), random forest (RF) and LightGBM. We also apply a traditional statistical method, logistic regression (LR), to predict Escherichia coli (E. coli) contamination in water taps. With some input variables (such as the length from the nearest sources) obtained from the pseudo-pipeline network, the results show that SVM has stable and high accuracy both for all 26 cities (70%) and for the 25 cities excluding Kathmandu (79%). LR achieved significantly lower accuracy for all 26 cities (61%) than for the 25 cities excluding Kathmandu (79%). Additionally, we show that our method can be applied to other regions where a water quality survey has not yet been conducted.
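
The abstract does not say how the pseudo-pipeline network is built or how the "length from the nearest sources" variable is computed. As an illustration only, the sketch below assumes the network can be represented as a weighted undirected graph whose edge weights are segment lengths, and derives each tap's network length to its nearest source with a multi-source Dijkstra search. The node names, segment lengths and the function `nearest_source_length` are hypothetical and are not taken from the study.

```python
import heapq


def nearest_source_length(graph, sources):
    """Return the shortest network length from any source to each reachable node.

    graph   : dict mapping node -> list of (neighbour, segment_length) pairs
    sources : iterable of node names treated as water sources
    """
    # Every source starts at distance 0 (multi-source Dijkstra).
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)

    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbour, length in graph.get(node, []):
            candidate = d + length
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical toy network (segment lengths in metres), NOT data from the study.
edges = [
    ("source_A", "junction_1", 400.0),
    ("junction_1", "tap_1", 150.0),
    ("junction_1", "tap_2", 300.0),
    ("source_B", "tap_2", 900.0),
]

# Build an undirected adjacency list from the edge list.
graph = {}
for u, v, length in edges:
    graph.setdefault(u, []).append((v, length))
    graph.setdefault(v, []).append((u, length))

lengths = nearest_source_length(graph, ["source_A", "source_B"])
for tap in ("tap_1", "tap_2"):
    print(tap, lengths[tap])
```

A value computed this way could serve as one input column for the classifiers named in the abstract (SVM, RF, LightGBM, LR), alongside the measured source and tap quality variables.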
