The feeding process of an ethylene cracking furnace requires the synchronized adjustment of multiple controlled variables. In practice, the process relies largely on manual operation, which is burdensome and, because operators differ in expertise, can lead to significant variations in the coil outlet temperature (COT). This paper proposes a method for learning the feeding strategy of the ethylene cracking furnace via offline reinforcement learning. The agent learns and optimizes the operating strategy directly from historical datasets, eliminating the need for sophisticated process-simulator modeling. In addition, an advantage function is incorporated into the Twin Delayed Deep Deterministic policy gradient with Behavioral Cloning (TD3BC) algorithm, enabling the agent to extract more effective operational experience. The proposed method is first evaluated on benchmark datasets and then validated through comparative experiments on a feeding-process validation model, where it achieves higher rewards than both manual operating experience and other offline reinforcement learning methods.
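The abstract does not spell out how the advantage function enters TD3BC. As one plausible reading, the sketch below shows the standard TD3+BC actor loss (Fujimoto and Gu, 2021) with a hypothetical advantage-weighted behavioral-cloning term; the `actor`/`critic` signatures, the `exp(A)` weighting, and the default `alpha=2.5` are illustrative assumptions, not the paper's confirmed design.

```python
import torch

def td3bc_actor_loss(actor, critic, state, action, alpha=2.5, adv_weight=True):
    """TD3BC actor loss with an optional advantage-weighted BC term.

    Standard TD3+BC:
        L = -lambda * Q(s, pi(s)) + (pi(s) - a)^2,  lambda = alpha / E|Q|.
    The exp(A)-weighting of the BC term below is an assumed variant;
    the abstract does not specify how the advantage enters the loss.
    """
    pi = actor(state)                      # policy action, [B, act_dim]
    q = critic(state, pi).squeeze(-1)      # Q(s, pi(s)), [B]
    lam = alpha / q.abs().mean().detach()  # scales Q term against BC term
    bc = ((pi - action) ** 2).sum(dim=-1)  # per-sample BC error, [B]
    if adv_weight:
        with torch.no_grad():
            # Hypothetical advantage: A(s, a) = Q(s, a) - Q(s, pi(s)),
            # up-weighting dataset actions that outperform the policy.
            adv = critic(state, action).squeeze(-1) - q
            w = torch.exp(adv.clamp(max=5.0))  # clamp for stability
        bc = w * bc
    return (-lam * q + bc).mean()
```

Under this reading, the weighting biases behavioral cloning toward the higher-quality operator trajectories in the dataset, which is consistent with the stated goal of acquiring more effective operational experience.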