Abstract
This study applied reinforcement learning (RL) optimization to a simulation of a water resource recovery facility (WRRF) to evaluate the impact of reward function design under varying effluent requirements. Several mathematical structures were evaluated for the effluent quality index (EQI) portion of the reward function under current treatment requirements. Of these, a fraction-based structure produced the highest level of optimization, as well as the best mix of results along an optimal risk-reward tradeoff line. The study also found that the training success rate could be tuned by changing the weight given to the EQI. Because the current treatment requirements are simple, agents trained for this case showed a very clear risk-reward tradeoff. The most cost-effective agent reduced operational costs by 10.9% compared to current operation, equivalent to yearly savings of $267,000. RL agents were also evaluated for a future case in which treatment requires nutrient removal. As the future case was more complex than the current case, relative risk was evaluated using a combination of basic indicators (such as maximum effluent value and instantaneous limit exceedance), correlation matrices to uncover state-action relationships, and challenge testing.
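The abstract does not give the reward formula, so the following is only a minimal sketch of what a fraction-based, EQI-weighted reward and the basic risk indicators might look like. The function names, the specific form of the cost and EQI terms, and the default `eqi_weight` are all assumptions made for illustration, not the study's method.

```python
from typing import Dict, Sequence


def fraction_reward(effluent: float, limit: float, cost: float,
                    baseline_cost: float, eqi_weight: float = 1.0) -> float:
    """Hypothetical reward: fractional cost saving minus a weighted EQI fraction.

    The EQI term is expressed as effluent / limit (an assumed reading of
    "fraction-based"); eqi_weight is the tunable weight given to the EQI.
    """
    cost_term = 1.0 - cost / baseline_cost   # fraction of baseline cost saved
    eqi_term = effluent / limit              # fraction of the effluent limit used
    return cost_term - eqi_weight * eqi_term


def risk_indicators(effluent_series: Sequence[float], limit: float) -> Dict[str, float]:
    """Basic indicators: maximum effluent value and instantaneous limit exceedances."""
    exceedances = sum(1 for value in effluent_series if value > limit)
    return {
        "max": max(effluent_series),
        "exceedances": exceedances,
        "exceedance_fraction": exceedances / len(effluent_series),
    }
```

In this sketch, raising `eqi_weight` penalizes operation close to the limit more heavily, which is one way the weight could shift an agent along a risk-reward tradeoff.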