Abstract

Optimal control problems are among the most challenging problems in optimization. This paper presents a new, efficient reinforcement learning (RL) approach to optimal control problems based on the batch Q-learning algorithm. To improve the convergence of the RL algorithm, we exploit the k-dimensional uniformity of an advanced sampling procedure, Hammersley sequence sampling (HSS). HSS is used to sample the state variables and the discrete controls from the action space of the RL optimal control problem. The neural fitted Q-iteration algorithm is applied to solve an optimal control problem for a first-order state dynamical system. A real-world application, the determination of the optimal temperature profile for biodiesel production in a batch reactor, is presented. We compare the results of our HSS-RL algorithm with those obtained from the maximum principle.
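To illustrate the sampling idea, the following is a minimal sketch of how a Hammersley point set could be generated and mapped to state and discrete-control samples. It is not the implementation used in the paper: the function names (`radical_inverse`, `hammersley`, `sample_state_action`), the scalar state range, and the rule for mapping a unit-interval coordinate to a discrete control are all assumptions made for this example.

```python
# Illustrative sketch only; names and the state/action mapping are assumptions.

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def radical_inverse(i, base):
    """Mirror the base-`base` digits of the integer i about the radix point."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv


def hammersley(n, k):
    """Return n points of the k-dimensional Hammersley set in [0, 1)^k.

    The first coordinate is i / n; the remaining k - 1 coordinates are
    radical inverses of i in the first k - 1 prime bases.
    """
    if not 1 <= k <= len(PRIMES) + 1:
        raise ValueError("k is outside the range supported by this sketch")
    return [
        [i / n] + [radical_inverse(i, PRIMES[d]) for d in range(k - 1)]
        for i in range(n)
    ]


def sample_state_action(n, state_lo, state_hi, actions):
    """Map 2-D Hammersley points to (state, discrete control) pairs.

    Assumes a scalar state in [state_lo, state_hi) and a finite list of
    controls; both are hypothetical choices for illustration.
    """
    samples = []
    for u_state, u_action in hammersley(n, 2):
        state = state_lo + u_state * (state_hi - state_lo)
        action = actions[int(u_action * len(actions))]
        samples.append((state, action))
    return samples


if __name__ == "__main__":
    for state, action in sample_state_action(4, 0.0, 10.0, ["low", "high"]):
        print(state, action)
```

Unlike pseudo-random draws, the points are deterministic and spread evenly over the unit hypercube, which is the uniformity property the abstract refers to. Pairs generated this way could then serve as the batch of transitions' starting points for a batch Q-learning fit.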
