Abstract

Batch-to-batch (B2B), or run-to-run (R2R), optimization is the strategy of updating the operating parameters of a batch run based on the results of previous runs, exploiting the repetitive nature of batch process operation. Although B2B optimization uses feedback from previous batch runs to learn about model uncertainty and improve the operation of future runs, the standard techniques are limited in two ways: they learn only passively, and they are myopic in making adjustments. This work proposes a novel way of using reinforcement learning to embed active learning into B2B optimization. To this end, the B2B optimization problem is formulated as the maximization of the long-term performance of repeated batch runs, which are modeled as a stochastic process with uncertain parameters. The resulting Bayes-Adaptive Markov decision process (BAMDP) problem is solved in a near-optimal manner with a policy gradient reinforcement learning algorithm. Through case studies, the behavior and effectiveness of the proposed B2B optimization method are examined by comparing it with the traditional certainty-equivalence-based B2B optimization method with passive learning.
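To make the baseline concrete, the following is a minimal sketch of certainty-equivalence B2B optimization with passive learning on a toy problem. Everything in it is an assumption made for illustration and is not taken from the case studies: the quadratic yield model `theta * u - 0.5 * u**2`, the Gaussian belief over the single uncertain parameter `theta`, and the names `run_batch` and `ce_b2b`. Each batch picks the input that is optimal for the current point estimate, then updates the belief from the measured result. The input is never chosen to gain information, which is what makes the learning passive and the adjustment myopic.

```python
import random


def run_batch(u, theta, rng, noise_std=0.1):
    """Hypothetical batch run: noisy end-of-batch yield for operating parameter u."""
    return theta * u - 0.5 * u ** 2 + rng.gauss(0.0, noise_std)


def ce_b2b(theta_true=2.0, theta0=1.0, p0=1.0, n_batches=20, noise_std=0.1, seed=0):
    """Certainty-equivalence B2B loop with a scalar Gaussian belief (mean, variance)."""
    rng = random.Random(seed)
    theta_hat, p = theta0, p0
    history = []
    for _ in range(n_batches):
        # Certainty equivalence: treat theta_hat as the truth and maximize
        # theta_hat * u - 0.5 * u**2, whose maximizer is u = theta_hat.
        u = theta_hat
        y = run_batch(u, theta_true, rng, noise_std)

        # Passive learning: Bayesian (scalar Kalman) update from this batch.
        # Rearranged measurement: z = y + 0.5 * u**2 = theta * u + noise.
        z = y + 0.5 * u ** 2
        gain = p * u / (u ** 2 * p + noise_std ** 2)
        theta_hat = theta_hat + gain * (z - theta_hat * u)
        p = (1.0 - gain * u) * p

        history.append((u, y, theta_hat, p))
    return history


if __name__ == "__main__":
    for i, (u, y, theta_hat, p) in enumerate(ce_b2b(), start=1):
        print(f"batch {i:2d}: u={u:.3f} yield={y:.3f} theta_hat={theta_hat:.3f} var={p:.5f}")
```

In this toy setting the certainty-equivalent input happens to be informative, so the estimate converges quickly. The BAMDP formulation targets cases where this is not so, and where deliberately probing in early batches can improve performance accumulated over the remaining runs.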
