Methods and Applications of Deep Reinforcement Learning for Chemical Processes

Christian D Hubbs

doi:10.1184/r1/14718714.v1

Abstract

Planning and scheduling are critical operational roles in any manufacturing business. In most companies in the chemical industry, these roles are handled by human planners and schedulers who have to make complex decisions under uncertainty to balance inventory costs, production costs, and customer service levels while respecting the constraints of the manufacturing facility or equipment that they are working with. Balancing these trade-offs is a difficult task, particularly given the uncertainty around customer demand, pricing, and equipment reliability. In this thesis, we approach scheduling problems using reinforcement learning to learn a policy for generating schedules that meet the stated business criteria.

In the first part of this thesis, we examine reinforcement learning in the context of scheduling a single-stage, continuous reactor under uncertainty. We develop a simulation of the plant based on historical data and train an agent to schedule it. Additionally, we benchmark this agent against deterministic MILP models, a stochastic programming model, and a perfect information model to better gauge the efficacy of the reinforcement learning approach. We find that the reinforcement learning model performs very favorably in this scenario and can quickly produce schedules to react to changes in the plant in real time. This single-stage model is then extended to a more general, multi-stage model based on the same physical system. The multi-stage model is more difficult to train with a single policy, as the agent has to learn a policy for both the continuous reactor and a bagging line. We explore a number of strategies for addressing this problem and show results pertaining to each. We have also implemented a reinforcement learning model in the actual plant we have been exploring. We developed a series of techniques to monitor and evaluate the system in production, remaining cognizant of the non-stationary nature of the pricing and demand patterns on which the agent was trained. Other modeling features and considerations related to implementing a reinforcement learning model in practice are also addressed.

In the second part, we turn to a series of broader operations research and supply chain problems. Here, we show results from OR-Gym, an open-source Python library of classic operations research problems for reinforcement learning. The OR-Gym environments were explored to show the efficacy of reinforcement learning in comparison to many classic optimization approaches and to determine when and where practitioners may benefit from utilizing reinforcement learning. Finally, we extend an inventory management problem developed in the OR-Gym library to a larger supply network and show how reinforcement learning compares with a variety of optimization methods under uncertainty.

Full Text