Water quality prediction: a data-driven approach exploiting advanced machine learning algorithms with data augmentation

Karthick K, S Krishnan, R Manikandan

doi:10.2166/wcc.2023.403

Abstract

Water quality assessment plays a crucial role in various aspects, including human health, environmental impact, agricultural productivity, and industrial processes. Machine learning (ML) algorithms offer the ability to automate water quality evaluation and allow for effective and rapid assessment of parameters associated with water quality. This article proposes an ML-based classification model for water quality prediction. The model was tested with 14 ML algorithms and considers 20 features that represent various substances present in water samples and their concentrations. The dataset used in the study comprises 7,996 samples, and the model development involves several stages, including data preprocessing, Yeo–Johnson transformation for data normalization, principal component analysis (PCA) for feature selection, and the application of the synthetic minority over-sampling technique (SMOTE) to address class imbalance. Performance metrics, such as accuracy, precision, recall, and F1 score, are provided for each algorithm with and without SMOTE. LightGBM, XGBoost, CatBoost, and Random Forest were identified as the best-performing algorithms. XGBoost achieved the highest accuracy of 96.31% without SMOTE and had a precision of 0.933. The application of SMOTE enhanced the performance of CatBoost. These findings provide valuable insights for ML-based water quality assessment, aiding researchers and professionals in decision-making and management.
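
To make the class-balancing step concrete, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed at a random position on the line segment between an existing minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_sketch`, its parameters, and the toy two-feature data are assumptions made for illustration only; the study itself uses 20 features and 7,996 samples.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; a simplified version of the
    SMOTE idea using Euclidean nearest neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random fraction of the way towards the neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed, not from the paper): 4 minority samples, 10 majority samples.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
majority_count = 10

new_points = smote_sketch(minority, n_new=majority_count - len(minority), k=3, seed=42)
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Because every synthetic point lies between two real minority samples, oversampling of this kind adds no information from outside the observed minority region. In practice it is typically applied to the training split only, so that evaluation metrics are computed on unaltered data.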

Journal: Journal of Water and Climate Change
Publication Date: Dec 20, 2023
Citations: 1
License type: CC BY 4.0
