Many wastewater utilities have discharge permits tied directly to the flow of the receiving river, so accurate prediction of hydraulic throughput is critical for safe operation and environmental protection. Current operation, which relies on empirical knowledge, faces many challenges. In this study, we developed and assessed daily-adaptive, probabilistic soft-sensor prediction models that forecast the next month's average receiving-river flowrate to guide utility operations. Among the 11 machine-learning methods compared, extra trees regression exhibits the desired deterministic prediction accuracy at day 0 (overall accuracy index: 3.9 × 10⁻³ 1/cms², where cms is cubic meters per second), and its accuracy increases steadily over the course of the month (e.g., MAPE and RMSE decrease from 41.46% and 23.31 cms to 3.31% and 2.81 cms, respectively). The overall classification accuracy for three river-flow classes reaches 0.79 at the beginning and increases to about 0.97 over the course of the predicted month. To manage the uncertainty caused by potential false negative classifications, which arise as overestimations, a probabilistic assessment of the predictions based on the 95% lower prediction interval (PI) bound was developed; it reduces false negative classifications from 17% to nearly zero with a slight sacrifice in overall classification accuracy.
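As a rough illustration of the quantities named in the abstract, the following minimal sketch computes MAPE and RMSE and contrasts a point-prediction class with a more conservative class taken from a lower prediction bound. It is not the study's implementation. The class thresholds, the function names, the sample values, and the choice to estimate the lower bound as the empirical 2.5th percentile of ensemble member predictions are all assumptions made for illustration; the abstract does not specify how the PI is computed.

```python
import math
import statistics


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the inputs (here cms)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def lower_bound(member_predictions, q=0.025):
    """Empirical q-quantile of ensemble member predictions (linear interpolation).

    Assumption: the lower PI bound is approximated by a percentile of the
    individual ensemble predictions; the study may use a different method.
    """
    xs = sorted(member_predictions)
    pos = q * (len(xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def flow_class(flow_cms, thresholds=(10.0, 30.0)):
    """Return 0 (low), 1 (medium), or 2 (high). Thresholds are hypothetical."""
    return sum(flow_cms >= t for t in thresholds)


# Hypothetical monthly-average flow predictions (cms) from five ensemble members.
members = [8.0, 12.0, 14.0, 16.0, 20.0]

point_class = flow_class(statistics.mean(members))    # class from the point prediction
conservative_class = flow_class(lower_bound(members))  # class from the lower bound
```

With these made-up numbers, the point prediction falls in the medium class while the lower bound falls in the low class, which is the kind of conservative shift that guards against overestimating the river flow.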