Abstract
One popular method for optimizing systems, referred to as ANN-PSO, uses an artificial neural network (ANN) to approximate the system and an optimization method such as particle swarm optimization (PSO) to select inputs. However, with recent developments in reinforcement learning, it is important to compare ANN-PSO to newer algorithms, such as Proximal Policy Optimization (PPO). To investigate the performance and applicability of ANN-PSO and PPO, we compare their methodologies, apply them to the steady-state economic optimization of a chemical process, and compare their results to a conventional approach using first-principles modeling with nonlinear programming (FP-NLP). Our results show that ANN-PSO and PPO achieve profits nearly as high as FP-NLP, with PPO achieving slightly higher profits than ANN-PSO. We also find that PPO has the fastest computational times, 10 and 10,000 times faster than FP-NLP and ANN-PSO, respectively. However, PPO requires more training data than ANN-PSO to converge to an optimal policy. This case study suggests that PPO has better performance, as it achieves higher profits and faster online computational times, whereas ANN-PSO shows better applicability, as it can be trained on historical operational data and trains more efficiently.
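To make the ANN-PSO setup described above concrete, the sketch below is a minimal illustration, not the paper's implementation: a surrogate model predicts profit from candidate inputs, and PSO searches the input space for the most profitable operating point. The quadratic profit surface and the PSO hyperparameters are illustrative assumptions; in the paper's setting the surrogate would be an ANN trained on plant data.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_profit(x):
    # Hypothetical stand-in for a trained ANN: profit peaks at x = (1.0, -0.5).
    return -((x[:, 0] - 1.0) ** 2 + (x[:, 1] + 0.5) ** 2)

def pso(objective, lo, hi, n_particles=30, n_iters=100, w=0.7, c1=1.5, c2=1.5):
    # Standard global-best particle swarm optimization (maximization).
    dim = lo.size
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    p_best, p_val = x.copy(), objective(x)             # personal bests
    g_best = p_best[np.argmax(p_val)]                  # global best position
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = np.clip(x + v, lo, hi)
        val = objective(x)
        improved = val > p_val
        p_best[improved], p_val[improved] = x[improved], val[improved]
        g_best = p_best[np.argmax(p_val)]
    return g_best, p_val.max()

best_inputs, best_profit = pso(surrogate_profit,
                               lo=np.array([-5.0, -5.0]),
                               hi=np.array([5.0, 5.0]))
print("best inputs:", best_inputs, "predicted profit:", best_profit)
```

Because PSO only queries the surrogate, the online cost is dominated by many repeated surrogate evaluations, which is consistent with the slower online times reported for ANN-PSO above.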
Introduction
To investigate whether the widely used artificial neural network with particle swarm optimization (ANN-PSO) approach can be replaced by newer actor–critic methods, this paper presents a novel comparison of two algorithms, ANN-PSO and Proximal Policy Optimization (PPO), by examining their methodologies and evaluating both on a case study of a stochastic steady-state chemical process optimization problem.
An optimization problem is considered where an agent interacts with an environment which is assumed to be fully observable. This problem can be formulated as a Markov Decision Process (MDP) where the environment is described by a set of possible states S ⊆ ℝⁿ, a set of possible actions A ⊆ ℝᵐ, a distribution of initial states p(s₀), a reward function R(sₜ, aₜ) given state sₜ and action aₜ, a transition probability p(sₜ₊₁ | sₜ, aₜ), and a future reward discount factor γ.
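As an illustration of these MDP components, the sketch below defines a toy environment in the spirit of the paper's setting: real-valued states and actions, a noisy transition standing in for p(sₜ₊₁ | sₜ, aₜ), and a profit-like reward standing in for R(sₜ, aₜ). The specific dynamics, dimensions, and profit surface are placeholders, not the paper's process model.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class ToyProcessMDP:
    # Illustrative placeholder for the fully observable process environment.
    gamma: float = 0.99                 # future-reward discount factor
    noise: float = 0.05                 # stochastic disturbance magnitude
    rng: np.random.Generator = field(default_factory=np.random.default_rng)

    def reset(self):
        # Draw an initial state s0 ~ p(s0).
        return self.rng.uniform(-1.0, 1.0, size=2)

    def step(self, state, action):
        # Transition p(s_{t+1} | s_t, a_t): the plant settles near the
        # commanded setpoints, perturbed by random disturbances.
        next_state = action + self.noise * self.rng.standard_normal(2)
        # Reward R(s_t, a_t): a hypothetical profit, highest near (1.0, -0.5).
        reward = -np.sum((next_state - np.array([1.0, -0.5])) ** 2)
        return next_state, reward

env = ToyProcessMDP()
state = env.reset()
state, reward = env.step(state, action=np.array([0.8, -0.4]))
print("next state:", state, "reward:", reward)
```

In a steady-state setting the next observed state depends mainly on the chosen setpoints rather than on the previous state, which is why the placeholder transition above ignores the current state.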
Summary
Machine learning has shown success in optimizing complex systems, such as scheduling electricity prices to manage demand and maximize power grid performance [4,5,6]. This motivates exploration of other machine learning techniques, like reinforcement learning (RL), for model-free optimization [7]. RL research has seen many breakthroughs in recent years, with new algorithms capable of defeating most humans in difficult games [8,9,10,11]. These algorithms are not designed to play games, but to learn and accomplish general tasks. A real-world example can be seen in OpenAI's algorithm, which learned how to control a robotic hand to solve a Rubik's cube under disturbances [12].