The rise of deep reinforcement learning in recent years has led to its use in solving various challenging problems, such as the games of chess and Go. However, despite this success on highly complex problems, it remains unclear whether this class of methods is the best choice for control problems in general, such as driverless cars, mobile robot control, or industrial manipulator control. This paper presents a comparative study of several classes of control algorithms and reinforcement learning on an inverted pendulum system, in order to evaluate the performance of reinforcement learning on a control problem. Root locus-based control, state compensator control, proportional-derivative (PD) control, and a reinforcement learning method, namely proximal policy optimization (PPO), were each tested on the task of controlling an inverted pendulum on a cart. The transient responses (such as overshoot, peak time, and settling time) and the steady-state responses (namely steady-state error and total energy) were compared. Given a sufficient amount of training, the reinforcement learning algorithm produced a solution comparable to those of the other control algorithms despite having no knowledge of the system's properties. Reinforcement learning is therefore best suited to plants for which little or no model information is available and for which testing a candidate policy is easy and safe. It is also recommended for systems with a clear objective function.
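As a rough illustration of the transient and steady-state metrics named in the abstract, the Python sketch below computes overshoot, peak time, settling time, and steady-state error from a sampled response. It is not taken from the paper: the function name `response_metrics`, the 2% settling band, and the use of a nonzero step reference `y_ref` are assumptions made for this example. A pendulum regulated around a zero angle would need an absolute band rather than a relative one.

```python
import numpy as np

def response_metrics(t, y, y_ref, tol=0.02):
    """Textbook step-response metrics for a sampled signal (illustrative sketch).

    Assumes y_ref is nonzero and the response starts below y_ref.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Peak time and percent overshoot, taken from the largest sample.
    i_peak = int(np.argmax(y))
    peak_time = t[i_peak]
    overshoot_pct = max(0.0, (y[i_peak] - y_ref) / abs(y_ref) * 100.0)

    # Settling time: first sample after the last excursion outside the band.
    outside = np.abs(y - y_ref) > tol * abs(y_ref)
    if not outside.any():
        settling_time = t[0]
    elif outside[-1]:
        settling_time = float("nan")  # never settles within the record
    else:
        settling_time = t[np.nonzero(outside)[0][-1] + 1]

    # Steady-state error approximated by the final sample.
    steady_state_error = y_ref - y[-1]

    return {
        "overshoot_pct": float(overshoot_pct),
        "peak_time": float(peak_time),
        "settling_time": float(settling_time),
        "steady_state_error": float(steady_state_error),
    }
```

The same function could be applied to the logged response of each controller, so that all methods are scored with an identical definition of each metric.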