Abstract

Five machine learning algorithms, namely Decision Tree (DT), Random Forest (RF), Multivariable Linear Regression (MLR), Support Vector Regression (SVR), and Gaussian Process Regression (GPR), were applied to predict the performance of a multi-media filter as a function of raw water quality and plant operating variables. The models were trained on data collected over a seven-year period covering water quality and operating variables, including true colour, turbidity, plant flow, and chemical doses of chlorine, KMnO4, FeCl3, and cationic polymer (PolyDADMAC). The algorithms showed that prediction is best at a 1-day time lag between the input variables and the unit filter run volume (UFRV). Furthermore, the RF algorithm with grid search, using the input metrics above at a 1-day time lag, provided the most reliable UFRV predictions, with an RMSE of 31.58 and an R2 of 0.98. RF with grid search also achieved the shortest training time and accurately forecast extreme wet weather events, as shown by ROC-AUC curve analysis (AUC over 0.8). Therefore, Random Forest with grid search and a 1-day time lag is an effective and robust machine learning algorithm that can predict filter performance and aid water treatment operators in their decision making by providing real-time warning of potential turbidity breakthrough from the filters.
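The best-performing approach described above, a Random Forest regressor tuned by grid search and fed 1-day-lagged inputs, can be sketched roughly as follows with scikit-learn. The study's seven-year dataset is not reproduced here, so the feature matrix, the UFRV target, and the hyperparameter grid below are illustrative assumptions only, not the authors' actual data or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the study's seven inputs (true colour, turbidity,
# plant flow, and chlorine/KMnO4/FeCl3/PolyDADMAC doses), one row per day.
n_days = 500
X = rng.normal(size=(n_days, 7))

# Mock UFRV for day t+1, generated from day t's inputs so the data really
# carries the 1-day lag relationship the abstract reports.
y_next = 300.0 + 20.0 * X[:-1, 0] - 15.0 * X[:-1, 1] \
    + rng.normal(scale=5.0, size=n_days - 1)
X_lagged = X[:-1]  # today's inputs predict tomorrow's UFRV

X_train, X_test, y_train, y_test = train_test_split(
    X_lagged, y_next, test_size=0.2, random_state=0)

# Grid search over a small, assumed hyperparameter grid.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=5,
)
grid.fit(X_train, y_train)

pred = grid.predict(X_test)
rmse = mean_squared_error(y_test, pred) ** 0.5
r2 = r2_score(y_test, pred)
print(f"RMSE: {rmse:.2f}, R2: {r2:.2f}")
```

On real plant data the same shape applies: shift the operating variables by one day relative to the UFRV series, tune RF via cross-validated grid search, and evaluate with RMSE and R2 on a held-out period.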
