Abstract

This work focuses on picking an object from a table with a mobile manipulator. We use deep reinforcement learning (DRL) to learn a positioning policy for the robot's base that accounts for the reachability constraints of the arm. This work extends our initial proof-of-concept, with the ultimate goal of validating the method on a real robot. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is used to model the base controller, which is optimised using feedback from the MoveIt!-based arm planner. The idea is to encourage the base controller to position the robot in areas from which the arm can reach the object. Following a simulation-to-reality approach, we first create a realistic simulation of the robotic environment in Unity and integrate it with the Robot Operating System (ROS); drivers for both the base and the arm are also implemented. The DRL-based agent is trained in simulation, and both the robot and target poses are randomised to make the learnt base controller robust to uncertainties. We propose a task-specific setup for TD3, which includes the state/action spaces, reward function and neural architectures. We compare the proposed method with the baseline work and show that the combination of TD3 and the proposed setup yields an 11% higher success rate than the baseline, with an overall success rate of 97%. Finally, the learnt agent is deployed and validated on the real robotic system, where we obtain a promising success rate of 75%.
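To illustrate the overall scheme, the sketch below shows how a base-positioning task of this kind could be wired to an off-the-shelf TD3 implementation. It is not the paper's actual environment or configuration: the observation/action dimensions, reward values, pose ranges and the reachability check are all placeholder assumptions (the real setup queries the MoveIt! arm planner inside the Unity/ROS simulation), and stable_baselines3/gymnasium stand in for the authors' own training code.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3


def arm_can_reach(base_pose, target_pose):
    """Stand-in for a MoveIt! reachability query: here we simply check
    that the target lies inside a hypothetical 0.3-0.8 m reach annulus."""
    dist = np.linalg.norm(target_pose[:2] - base_pose[:2])
    return 0.3 < dist < 0.8


class BasePositioningEnv(gym.Env):
    """Toy episodic task: move the base so the target becomes reachable by the arm."""

    def __init__(self, max_steps=50):
        super().__init__()
        # Observation: target pose relative to the base (x, y, yaw) -- hypothetical.
        self.observation_space = spaces.Box(low=-5.0, high=5.0, shape=(3,), dtype=np.float32)
        # Action: base displacement per step (dx, dy, dyaw) -- hypothetical.
        self.action_space = spaces.Box(low=-0.2, high=0.2, shape=(3,), dtype=np.float32)
        self.max_steps = max_steps

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Randomise both robot and target poses, as the abstract describes.
        self.base = self.np_random.uniform(-1.0, 1.0, size=3).astype(np.float32)
        self.target = self.np_random.uniform(-2.0, 2.0, size=3).astype(np.float32)
        self.steps = 0
        return self._obs(), {}

    def _obs(self):
        return (self.target - self.base).astype(np.float32)

    def step(self, action):
        self.base += action.astype(np.float32)
        self.steps += 1
        reached = arm_can_reach(self.base, self.target)
        # Sparse success bonus plus a small step penalty to encourage short repositioning.
        reward = 10.0 if reached else -0.1
        terminated = bool(reached)
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, terminated, truncated, {}


if __name__ == "__main__":
    model = TD3("MlpPolicy", BasePositioningEnv(), verbose=1)
    model.learn(total_timesteps=20_000)
```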
