The effect of model uncertainties in the reinforcement learning based regulation problem: An experimental case study with inverted pendulum

Amit Kumar Pal,Atta Oveisi,Tamara Nestorović

doi:10.1002/pamm.202300130

Abstract

AbstractThis work aims to improve the classical white‐box model of an inverted pendulum in order to reach a more accurate representation of an actual pendulum on a cart system. The purpose of the model is to train different controllers based on machine learning algorithms. In the context of this paper, the inverted pendulum system is driven by a belt drive that is controlled by a stepper motor. Due to the nature of the controller, the input to the stepper motor is in the form of a non‐smooth bang‐bang‐like signal that moves the cart to the left, right, or terminates its movement. One of the main challenges, in this case, is to find a proper function to model the stepper motor as its dynamics cannot be captured with a constant gain. It has been shown that the transient behavior of the stepper motor when changing direction or stopping is not negligible in the closed‐loop control performance. Accordingly, a grey‐box scheme, which accounts for the uncertainties that are not included in the vanilla white‐box model, is utilized to achieve a lower model mismatch compared to the actual pendulum. Initially, the equation of motion was derived using the Euler‐Lagrange equation with force on the cart as the control input. But in the real‐time experiment, the interface is realized by the stepper motor's frequency modulator, hence a transfer function representing the relationship between the frequency and the force applied on the cart (in the model) is calculated as a black‐box model. To improve the accuracy of the transfer function, an experimental data‐driven design of this function is performed based on modern schemes in system identification. For this purpose, the applied frequency to the stepper motor and the states from the actual system are recorded. Then, the applied force on the cart is calculated using the equation of motion and the recorded states. It is also shown that the frequency‐force transfer function uncertainty due to exogenous disturbances is non‐negligible and with the aim of having a more accurate model, an artificial neural network is introduced. Finally, the effectiveness of this grey‐box model is shown by training and implementing a deep Q‐network based controller to swing up and balance the inverted pendulum.

Full Text