Abstract

Reinforcement learning (RL) is a machine learning method that has recently seen significant research activity owing to its successes in robotics and game playing (Silver et al., 2017). However, significant challenges exist in extending these control methods to process control problems, where state and input signals are nearly always continuous and more stringent performance guarantees are required. The goal of this work is to explore ways that modern RL algorithms can be adapted to handle process control problems; avenues for this work include using RL with existing controllers such as model predictive control (MPC) and adapting cutting-edge actor-critic RL algorithms to find policies that meet the performance requirements of process control. Systems of special interest in this work come from energy production, particularly supercritical pulverized coal (SCPC) power production. This work also details the development of advanced models and control systems to solve specific problems in this setting. To study the SCPC system, a plantwide model with sufficient detail to develop controllers that are effective over a wide operating range is needed. Starting with a custom unit model for the steam turbine, a plantwide flowsheet is synthesized in Aspen Plus Dynamics with due consideration for the modeling of the balance of the plant (including appropriate sizing for key unit items) and the regulatory control layer. A custom, high-fidelity boiler model is also integrated into the model, allowing detailed studies to be conducted. This model is validated against operating data from the plant for which equipment sizing data were available, with parameters estimated where necessary to achieve a good fit to the data. Using this model, advanced model predictive control strategies are investigated for boiler control under load changes. Using a high-fidelity model of the selective catalytic reduction (SCR) unit from the SCPC plant as a testbed, the focus of the work shifts to RL-based controllers. The first of these is an RL approach for online tuning of an underlying MPC acting to regulate the plant. Configured in this way, the joint controller works to regulate the plant (MPC) while improving performance (RL). An approximate state-action-reward-state-action (SARSA) algorithm is used to select prediction and control horizons for the MPC problem, and the controller is developed in a two-stage approach, wherein the RL agent is first trained offline on a reduced-order model of the SCR to alleviate poor performance when controlling the true plant. This learning is
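
To make the RL-tuned MPC idea concrete, the following is a minimal sketch of an approximate SARSA agent choosing (prediction horizon, control horizon) pairs for an MPC layer. It assumes a linear value-function approximator, an epsilon-greedy policy, an illustrative candidate horizon set, and a hypothetical mpc_step stub standing in for the MPC solve on the reduced-order SCR model; the feature choice, horizon candidates, and reward definition are assumptions for illustration, not details taken from the thesis.

```python
import numpy as np

# Candidate (prediction horizon, control horizon) pairs; illustrative only.
ACTIONS = [(p, m) for p in (10, 20, 30) for m in (2, 5, 10) if m <= p]

def features(state, action_idx):
    """One-hot-by-action linear features for the state-action value function."""
    phi = np.zeros(len(state) * len(ACTIONS))
    phi[action_idx * len(state):(action_idx + 1) * len(state)] = state
    return phi

class ApproxSARSA:
    def __init__(self, n_features, alpha=0.01, gamma=0.95, eps=0.1):
        self.w = np.zeros(n_features)              # linear value-function weights
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def q(self, state, a):
        return self.w @ features(state, a)

    def act(self, state):
        if np.random.rand() < self.eps:            # epsilon-greedy exploration
            return np.random.randint(len(ACTIONS))
        return int(np.argmax([self.q(state, a) for a in range(len(ACTIONS))]))

    def update(self, s, a, r, s_next, a_next):
        # On-policy SARSA temporal-difference update.
        td = r + self.gamma * self.q(s_next, a_next) - self.q(s, a)
        self.w += self.alpha * td * features(s, a)

def mpc_step(state, horizons):
    """Hypothetical placeholder: solve the MPC problem with the chosen horizons,
    apply the first move to the reduced-order model, and return the next state
    and a reward (here, negative tracking error minus a horizon-length penalty)."""
    next_state = state + 0.1 * (np.random.rand(len(state)) - 0.5)
    reward = -float(np.sum(next_state ** 2)) - 1e-3 * horizons[0]
    return next_state, reward

# Offline training loop on the surrogate model, mirroring the two-stage approach
# described above (train offline first, then deploy against the full plant).
agent = ApproxSARSA(n_features=3 * len(ACTIONS))
state = np.zeros(3)
action = agent.act(state)
for step in range(1000):
    next_state, reward = mpc_step(state, ACTIONS[action])
    next_action = agent.act(next_state)
    agent.update(state, action, reward, next_state, next_action)
    state, action = next_state, next_action
```

In this configuration the MPC layer remains responsible for regulating the plant at every step, while the SARSA agent only adjusts the horizon lengths, which is one way to read the "regulate (MPC) while improving performance (RL)" split described in the abstract.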
