Abstract

Advances in machine learning technologies in recent years have facilitated developments in autonomous robotic systems. Designing these autonomous systems typically requires manually specified models of the robotic system and the world when using classical control-based strategies, or time-consuming and computationally expensive data-driven training when using learning-based strategies. Combining classical control and learning-based strategies may mitigate both requirements. However, the performance of the combined control system is not obvious, given that two separate controllers are involved. This paper focuses on one such combination, which uses gravity compensation together with reinforcement learning (RL). We present a study of the effects of gravity compensation on the performance of two RL algorithms when solving reaching tasks with a simulated seven-degree-of-freedom robotic arm. The results of our study demonstrate that gravity compensation coupled with RL can reduce the training required for reaching tasks involving elevated target locations, but not for all target locations.

Highlights

  • Autonomous robotic systems are widely recognized as a worthwhile technological goal for humanity to achieve

  • The number of time steps chosen for each run of Actor Critic using Kronecker-Factored Trust Region (ACKTR) and PPO2 corresponded to the smallest number of time steps necessary to accumulate close to the maximum reward with the best hyperparameter selection found in our initial experiments

  • We introduced an approach to combine reinforcement learning (RL) and control theory by adding terms used in a classical controller to the RL output (a minimal illustration follows below)
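To make the combination concrete, the following is a minimal sketch, not the authors' implementation, of how a classical gravity-compensation term could be added to an RL policy's torque output in a Gym-style environment. The class name GravityCompensationWrapper, the gravity_torque callable, and the get_joint_positions accessor are hypothetical names introduced here for illustration; in practice they would come from the simulator's dynamics model and state interface for the 7-DOF arm.

```python
import gym
import numpy as np


class GravityCompensationWrapper(gym.Wrapper):
    """Adds an estimated gravity torque G_hat(q) to the torques chosen by the RL policy.

    `gravity_torque` (a callable mapping joint positions q to G_hat(q)) and
    `get_joint_positions` are hypothetical placeholders for illustration only.
    """

    def __init__(self, env, gravity_torque):
        super().__init__(env)
        self._gravity_torque = gravity_torque

    def step(self, action):
        q = self.env.get_joint_positions()        # joint angles (hypothetical accessor)
        tau_rl = np.asarray(action)               # torque proposed by the RL policy
        tau = tau_rl + self._gravity_torque(q)    # total torque applied to the arm
        return self.env.step(tau)
```

Under this kind of setup, an agent such as PPO2 or ACKTR would be trained on the wrapped environment exactly as on the original one, so the policy only needs to learn the residual torque on top of the gravity-compensation term.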

Summary

Introduction

Autonomous robotic systems are widely recognized as a worthwhile technological goal for humanity to achieve. Control theory provides a methodology for creating controllers for dynamical systems in order to accomplish a specified task [14]. One way to use the dynamics model in (1) is to use estimates of $M$, $C$, $h$, and $G$ (denoted $\hat{M}$, $\hat{C}$, $\hat{h}$, and $\hat{G}$, respectively) to compensate for the system dynamics and create an auxiliary control law for the system. The form of this control is shown in (2), where $\tau_{\text{cont}}$ is the auxiliary torque. The choice of $\tau_{\text{cont}}$ is arbitrary and may be selected using a feedback controller to stabilize the system or to drive the states to a desired value.
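Equations (1) and (2) themselves are not reproduced in this excerpt. As a reconstruction under standard rigid-body manipulator conventions, and assuming $M$, $C$, $h$, and $G$ denote the inertia matrix, Coriolis matrix, friction/velocity-dependent term, and gravity term, they would typically take a form such as

$$M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + h(\dot{q}) + G(q) = \tau \tag{1}$$

$$\tau = \hat{M}(q)\,\tau_{\text{cont}} + \hat{C}(q,\dot{q})\,\dot{q} + \hat{h}(\dot{q}) + \hat{G}(q) \tag{2}$$

When the estimates in (2) match the true dynamics, substituting (2) into (1) leaves the closed-loop behavior governed by $\tau_{\text{cont}}$ alone, which is why its choice can be made freely. The exact grouping of terms in the paper's equations may differ from this sketch.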
