Abstract

The growing demand for robots able to act autonomously in complex scenarios has widely accelerated the introduction of Reinforcement Learning (RL) in robot control applications. However, the intrinsic trial-and-error nature of RL may result in long training times on real robots and, moreover, may lead to dangerous outcomes. While simulators are useful tools to accelerate RL training and to ensure safety, they often provide only an approximate model of the robot dynamics and of its interaction with the surrounding environment, thus resulting in what is called the reality gap (RG): a mismatch between simulated and real control-law performance caused by the inaccurate representation of the real environment in simulation. The most undesirable outcome occurs when the controller learnt in simulation fails the task on the real robot, i.e., an unsuccessful sim-to-real transfer. The goal of the present survey is threefold: (1) to identify the main approaches to face the RG problem in the context of robot control with RL, (2) to point out their shortcomings, and (3) to outline new potential research areas.
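To make the idea of bridging the RG concrete, the sketch below illustrates domain randomisation, one of the approaches covered later in the survey: uncertain simulator parameters are resampled before every training episode so that the learnt policy cannot overfit a single, inaccurate dynamics model. This is a minimal, hypothetical example and not code from any surveyed work; the parameter names, ranges, and the `sample_dynamics` helper are assumptions for illustration only.

```python
import numpy as np

# Hypothetical ranges for physical parameters that are hard to identify exactly
# on the real robot; the policy is trained across the whole range.
RANDOMISATION_RANGES = {
    "mass":       (0.8, 1.2),    # link mass [kg]
    "friction":   (0.05, 0.3),   # viscous friction coefficient
    "motor_gain": (0.9, 1.1),    # actuator scaling factor
    "obs_noise":  (0.0, 0.02),   # sensor noise standard deviation
}

def sample_dynamics(rng):
    """Draw one set of simulator parameters for the next training episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANDOMISATION_RANGES.items()}

rng = np.random.default_rng(0)
for episode in range(3):
    params = sample_dynamics(rng)
    # A hypothetical simulator would be re-configured here, e.g. sim.reset(**params),
    # before collecting the episode used to update the RL controller.
    print(episode, params)
```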

Highlights

  • Reinforcement Learning (RL) [1] allows the design of controllers capable of learning an optimal behaviour by interacting with the environment

  • On the other hand, when Deep Reinforcement Learning (DRL) is employed directly on the real robot, it results in considerably long training times

  • Here we focus our attention only on those works addressing the reality gap (RG) problem for robot controllers learnt with Reinforcement Learning (RL)


Summary

INTRODUCTION

Reinforcement Learning (RL) [1] allows the design of controllers (often referred to as agents) capable of learning an optimal behaviour by interacting with the environment. An increase in system complexity is often accompanied by an increase in the dimensions of the state and action spaces, which makes a tabular approach intractable. In challenging cases, such as robot control, treating state and action as continuous variables in a compact set is a more appropriate way to deal with the problem [2, 3]. The aims of the present work are: (1) to provide a systematic picture of the literature concerning how to solve the RG problem in robot control tasks with RL; (2) to clarify the differences between the three main identified approaches by highlighting their relative pros and cons; (3) to identify new possible research areas.
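As a minimal, self-contained illustration of the agent-environment interaction with continuous state and action spaces, the following Python sketch trains a linear Gaussian policy with REINFORCE on a toy 1-D point-mass task. It is not taken from the surveyed works; the environment, the policy parameterisation, and the names (`PointMassEnv`, `run_episode`, `train`) are assumptions chosen for brevity.

```python
import numpy as np

# Toy continuous-control task: drive a 1-D point mass to the origin.
# State: [position, velocity]; action: scalar force in [-1, 1].
class PointMassEnv:
    def __init__(self, dt=0.05, horizon=100):
        self.dt, self.horizon = dt, horizon

    def reset(self):
        self.t = 0
        self.state = np.random.uniform(-1.0, 1.0, size=2)
        return self.state.copy()

    def step(self, action):
        pos, vel = self.state
        force = float(np.clip(action, -1.0, 1.0))
        vel = vel + force * self.dt
        pos = pos + vel * self.dt
        self.state = np.array([pos, vel])
        self.t += 1
        reward = -(pos ** 2 + 0.1 * vel ** 2 + 0.01 * force ** 2)
        done = self.t >= self.horizon
        return self.state.copy(), reward, done


# Linear Gaussian policy: action ~ N(theta @ state, sigma^2).
def run_episode(env, theta, sigma=0.3):
    states, actions, rewards = [], [], []
    s, done = env.reset(), False
    while not done:
        a = theta @ s + sigma * np.random.randn()
        s_next, r, done = env.step(a)
        states.append(s); actions.append(a); rewards.append(r)
        s = s_next
    return np.array(states), np.array(actions), np.array(rewards)


# REINFORCE: ascend the policy-gradient estimate built from sampled rollouts.
def train(episodes=2000, lr=0.02, sigma=0.3):
    env, theta = PointMassEnv(), np.zeros(2)
    for ep in range(episodes):
        states, actions, rewards = run_episode(env, theta, sigma)
        returns = np.cumsum(rewards[::-1])[::-1]               # reward-to-go
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)
        # grad of log N(a | theta @ s, sigma^2) w.r.t. theta: (a - theta @ s) / sigma^2 * s
        grad = ((actions - states @ theta) / sigma ** 2)[:, None] * states
        theta += lr * (grad * returns[:, None]).mean(axis=0)
        if (ep + 1) % 500 == 0:
            print(f"episode {ep + 1:4d}  return {rewards.sum():8.2f}")
    return theta


if __name__ == "__main__":
    np.random.seed(0)
    train()
```

In a sim-to-real setting, the environment above would be replaced by a physics simulator, and the RG arises precisely because the dynamics implemented in `step` differ from those of the real robot.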

REINFORCEMENT LEARNING
ADOPTED FORMALISM
METHODOLOGIES FOR SOLVING RG
DOMAIN RANDOMISATION
TRANSFER LEARNING
DISCUSSION AND PROMISING IDEAS
CONCLUSIONS AND OPEN CHALLENGES