Abstract

When a motion is acquired solely from its objective through learning, a complex motion such as a jumping serve requires a large number of trials and incurs large costs, such as damage from falling over. Reusing previously learnt knowledge, as humans do, is an essential mechanism for learning such motions efficiently. In this paper, we propose a learning method that decomposes action-value functions for reuse within the framework of reinforcement learning. Avoidance actions that are assumed to be invariant across different tasks (e.g. avoiding falling over) are learnt separately from primary actions that are assumed to be task-specific; the action-value function for the avoidance actions is then reused in learning new tasks. Furthermore, we extend the method to multi-link robots so that they can learn whole-body motions. The proposed method is applied to moving tasks in both discrete and continuous planes, and to tennis-serve and jump tasks of a 4-link robot. We also demonstrate an issue that arises when reusing with a similar method, Q-decomposition [1]. The simulation results show a performance advantage of the proposed method over Q-decomposition in reusing avoidance actions.
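As a minimal sketch of this decomposition (our own illustration, not the paper's implementation; the table names, reward split, and update rule are hypothetical), the composite action value can be kept as the sum of a task-specific primary table and a task-invariant avoidance table, with the avoidance table frozen when it is reused on a new task:

    import numpy as np
    from collections import defaultdict

    ALPHA, GAMMA = 0.1, 0.99   # hypothetical learning rate and discount factor

    # Two action-value tables: Q_primary is task-specific, Q_avoid captures
    # the task-invariant avoidance behaviour (e.g. not falling over).
    Q_primary = defaultdict(float)
    Q_avoid = defaultdict(float)

    def q_total(state, action):
        # Composite value: the agent acts greedily on the sum of both parts.
        return Q_primary[(state, action)] + Q_avoid[(state, action)]

    def select_action(state, actions, eps=0.1):
        # Epsilon-greedy selection on the composite action value.
        if np.random.rand() < eps:
            return actions[np.random.randint(len(actions))]
        return max(actions, key=lambda a: q_total(state, a))

    def update(state, action, r_task, r_avoid, next_state, actions,
               reuse_avoid=False):
        # One TD update per component, each driven by its own reward signal
        # (an illustrative rule, not necessarily the paper's). With
        # reuse_avoid=True, the previously learnt avoidance table is frozen
        # and only the primary table is updated.
        best_next = max(q_total(next_state, a) for a in actions)
        target = GAMMA * best_next
        Q_primary[(state, action)] += ALPHA * (
            r_task + target - Q_primary[(state, action)])
        if not reuse_avoid:
            Q_avoid[(state, action)] += ALPHA * (
                r_avoid + target - Q_avoid[(state, action)])

Under these assumptions, learning a new task amounts to reinitialising Q_primary and calling update(..., reuse_avoid=True), so the avoidance knowledge carries over unchanged.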
