Abstract

This paper presents an approach for learning and generalizing robotic skills from a demonstration using deep reinforcement learning (deep RL). Dynamic Movement Primitives (DMPs) encode a demonstrated movement as a nonlinear differential equation and can reproduce the observed movement. However, it is difficult to generate new behaviors using DMPs alone. We therefore incorporate the DMP framework into deep RL as an initial setting for learning robotic skills. First, we build a network to represent this differential equation, and we learn and generalize movements by optimizing the shape of the DMPs with respect to the reward accumulated up to the end of each sequence of movement primitives. To do this, we use a deterministic actor-critic algorithm for deep RL and also apply a hierarchical strategy. Decomposing the task drastically reduces the robot's search space, which helps address the sparse-reward problem in a complex task. To integrate DMPs with hierarchical deep RL, the differential equation is treated as the temporal abstraction of an option. The overall structure consists mainly of two controllers: a meta-controller and a sub-controller. The meta-controller learns a policy over intrinsic goals, and the sub-controller learns a policy over actions to accomplish the given goals. We demonstrate our approach on a 6 degree-of-freedom (DOF) arm with a 1-DOF gripper and evaluate it on a pick-and-place task.
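For reference, the nonlinear differential equation mentioned above is, in the standard DMP formulation (Ijspeert et al.), a spring-damper transformation system driven by a learned forcing term; the abstract does not give the paper's exact notation, so the symbols below (tau, alpha_z, beta_z, g, psi_i, w_i) follow the common convention and are only a sketch of the canonical form:

\begin{align}
  \tau \dot{z} &= \alpha_z \bigl( \beta_z (g - y) - z \bigr) + f(x), \\
  \tau \dot{y} &= z, \\
  \tau \dot{x} &= -\alpha_x x, \qquad
  f(x) = \frac{\sum_i \psi_i(x)\, w_i}{\sum_i \psi_i(x)}\, x \,(g - y_0).
\end{align}

Here y is the movement variable, g the goal, x a phase variable, and the weights w_i of the forcing term f determine the shape of the primitive; in the approach described above, it is this shape that is optimized with respect to the reward by the actor-critic learner.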
