Source and raw water quality may deteriorate due to rainfall and river flow events in watersheds. The effects on raw water quality are normally detected at drinking water treatment plants (DWTPs) with a time lag after these events. Early warning systems (EWSs) in DWTPs require highly accurate models to anticipate changes in raw water quality parameters. Ensemble machine learning (EML) techniques have recently been used for water quality modeling to improve accuracy and decrease variance in the outcomes. We used three decision-tree-based EML models (random forest [RF], gradient boosting [GB], and eXtreme Gradient Boosting [XGB]) to predict two critical parameters for DWTPs, raw water turbidity and UV absorbance (UV254), using rainfall and river flow time series as predictors. For raw water turbidity, the three EML models showed very good performance metrics ($r^2_{\mathrm{RF-Tu}} = 0.87$, $r^2_{\mathrm{GB-Tu}} = 0.80$, and $r^2_{\mathrm{XGB-Tu}} = 0.81$). For raw water UV254, the three models again showed very good performance metrics ($r^2_{\mathrm{RF-UV}} = 0.89$, $r^2_{\mathrm{GB-UV}} = 0.85$, and $r^2_{\mathrm{XGB-UV}} = 0.88$). These results suggest that EML approaches could be used in EWSs to anticipate changes in raw water quality parameters and enhance decision-making in DWTPs.
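The time lag between watershed events and their effect at the plant is what makes lagged rainfall and flow series usable as predictors. The following is a minimal sketch, not the authors' pipeline, of how candidate lags might be screened by r² before fitting any ensemble model. The function names (`lagged_pairs`, `r_squared`, `best_lag`) are hypothetical, the series are made-up synthetic values, and r² is assumed here to mean the squared Pearson correlation, which the abstract does not specify.

```python
from statistics import mean


def lagged_pairs(predictor, target, lag):
    """Pair target[t] with predictor[t - lag] for every t where both exist."""
    return [(predictor[t - lag], target[t]) for t in range(lag, len(target))]


def r_squared(pairs):
    """Squared Pearson correlation between the two columns of `pairs`."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant series
    return sxy * sxy / (sxx * syy)


def best_lag(predictor, target, max_lag):
    """Return (lag, r2) for the lag in 0..max_lag with the highest r2."""
    scores = [
        (r_squared(lagged_pairs(predictor, target, k)), k)
        for k in range(max_lag + 1)
    ]
    r2, lag = max(scores)
    return lag, r2


# Synthetic, made-up data: turbidity responds to rainfall three steps later.
rain = [0, 0, 5, 12, 3, 0, 0, 8, 1, 0, 0, 0, 15, 4, 0, 0, 2, 0, 0, 0]
turbidity = [
    1.0 + 2.0 * (rain[t - 3] if t >= 3 else 0) for t in range(len(rain))
]

lag, r2 = best_lag(rain, turbidity, max_lag=6)
print(f"best lag = {lag}, r2 = {r2:.2f}")
```

In a real setting the selected lags for rainfall and river flow would become input columns for a tree-based ensemble regressor; this sketch covers only the lag-screening step on a single predictor.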