Abstract

Accurate water quality prediction plays an essential role in improving water management and pollution control. Machine learning models have been successfully applied to model total dissolved solids (TDS), sodium adsorption ratio (SAR) and total hardness (TH) in aquatic ecosystems with limited data. However, because of multiple pollution sources and the complex behaviour of pollutants, how well these models predict TDS, SAR and TH levels in the Karun River system remains unclear. To address this problem, multiple linear regression (MLR), M5P model tree, support vector regression (SVR) and random forest regression (RFR) models were used to predict TDS, SAR and TH at four stations on the Karun River for the period 1999-2019. Principal component analysis (PCA) was first applied to reduce the number of input variables. The developed models were evaluated in terms of the coefficient of determination (R²) and the root mean square error (RMSE). Based on the PCA, sodium (Na), chloride (Cl) and TH were found to be the most influential inputs for TDS, and Na and Cl the most influential for SAR, while calcium (Ca) and magnesium (Mg) were the most influential for TH. The results indicated that the RFR, SVR and MLR models had the lowest error in predicting TDS, SAR and TH, respectively, at all stations. At the Darkhovin station, the RFR model performed best for TDS (R² = 0.98, RMSE = 70.50 mg l⁻¹), the SVR model for SAR (R² = 0.99, RMSE = 0.04) and the MLR model for TH (R² = 0.99, RMSE = 1.54 mg l⁻¹). A comparison of the results indicated that the machine learning models could satisfactorily estimate TDS, SAR and TH at all stations.
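The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes R² is computed as 1 - SS_res/SS_tot; the abstract does not state which definition was used, and some studies report the squared Pearson correlation instead. The function names and the TDS values below are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observed variable."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted TDS values (mg/l), for illustration only.
observed_tds = [900.0, 1100.0, 1300.0, 1500.0]
predicted_tds = [950.0, 1050.0, 1350.0, 1450.0]

print("RMSE:", rmse(observed_tds, predicted_tds))
print("R2:", r_squared(observed_tds, predicted_tds))
```

RMSE carries the units of the target variable, which is why the abstract reports it in mg l⁻¹ for TDS and TH but as a bare number for the dimensionless SAR. R² is unitless, so it can be compared across the three targets.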
