Reinforcement Learning (RL) adapts to a dynamic environment through exploration and continual learning. However, the risks associated with exploration have limited the application of Reinforcement Learning Control (RLC) in industrial settings. This research presents a safe RLC framework for the bitumen extraction process from oil sands. We explore transfer learning via Behavioral Cloning (BC), Generative Adversarial Imitation Learning (GAIL), and Simulation-to-Reality (Sim2Real) pretraining, which allows the RL agent to acquire fundamental knowledge before real-world exploration. This transferred knowledge steers exploration towards near-optimal and safer regions. GAIL- and Sim2Real-pretrained agents deliver satisfactory control performance immediately after pretraining, reducing process trips during online training by factors of 8 and 27, respectively. Further online training reduces the Integral Squared Error (ISE) for interface tracking by 71% with GAIL and 22% with Sim2Real; the ISE for tailings density tracking likewise decreases by 73% and 33%, respectively. These substantial reductions demonstrate the continual learning ability that makes RL suitable for autonomous control. This work highlights the potential for safely deploying RL in process automation and shows that RLC can handle complex control problems in multivariate, multimodal, partially observable, and uncertain environments. RLC achieves performance on par with Model Predictive Control (MPC) while requiring less control effort and computing control actions faster by a factor of 10.
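For reference, the ISE figures quoted above use the standard tracking-error metric

$$\mathrm{ISE} = \int_{0}^{T} e(t)^{2}\,\mathrm{d}t,$$

where $e(t)$ is the deviation of the controlled variable (interface level or tailings density) from its setpoint over the evaluation horizon $T$.

To illustrate the pretraining step, the following is a minimal behavioral-cloning sketch in Python/PyTorch; the network architecture, observation and action dimensions, and demonstration data are placeholder assumptions, not the paper's actual implementation. The idea is to regress a policy onto state-action pairs logged from an incumbent controller, so the RL agent begins online exploration from a competent, safer baseline.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: 8 process measurements, 2 manipulated variables.
obs_dim, act_dim = 8, 2

# Small feedforward policy mapping observations to control actions.
policy = nn.Sequential(
    nn.Linear(obs_dim, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, act_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# demo_obs / demo_act stand in for trajectories logged from an existing
# controller; random tensors are used here only so the sketch runs.
demo_obs = torch.randn(1024, obs_dim)
demo_act = torch.randn(1024, act_dim)

for epoch in range(200):
    pred = policy(demo_obs)
    # Behavioral cloning: supervised regression onto the expert's actions.
    loss = nn.functional.mse_loss(pred, demo_act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After pretraining, the learned parameters would seed the online RL agent (GAIL and Sim2Real pretraining play the analogous role in the other two variants studied), which is what steers subsequent exploration toward the near-optimal, safer regions described above.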