Abstract

When a motion is acquired solely from its objective through learning, a complex motion such as a jumping serve requires a large number of trials and incurs large costs, such as damage from falling over. Reusing previously learnt knowledge, as humans do, is an essential mechanism for learning such motions efficiently. In this paper, we propose a learning method that decomposes action-value functions for reuse within the framework of reinforcement learning. Avoidance actions that are assumed to be invariant across different tasks (e.g. avoiding falling over) are learnt separately from primary actions that are assumed to be task-specific; the action-value function for the avoidance actions is then reused in learning new tasks. Furthermore, we extend the method to multi-link robots so that they can learn whole-body motions. The proposed method is applied to moving tasks in both discrete and continuous planes, and to tennis-serve and jump tasks of a 4-link robot. We also demonstrate an issue that arises when reusing with a similar method, Q-decomposition [1]. The simulation results show a performance advantage of the proposed method over Q-decomposition in reusing avoidance actions.
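As a minimal sketch of this decomposition (our own illustration, not the paper's implementation; the table names, reward split, and update rule are hypothetical), the composite action value can be kept as the sum of a task-specific primary table and a task-invariant avoidance table, with the avoidance table frozen when it is reused on a new task:

    import numpy as np
    from collections import defaultdict

    ALPHA, GAMMA = 0.1, 0.99   # hypothetical learning rate and discount factor

    # Two action-value tables: Q_primary is task-specific, Q_avoid captures
    # the task-invariant avoidance behaviour (e.g. not falling over).
    Q_primary = defaultdict(float)
    Q_avoid = defaultdict(float)

    def q_total(state, action):
        # Composite value: the agent acts greedily on the sum of both parts.
        return Q_primary[(state, action)] + Q_avoid[(state, action)]

    def select_action(state, actions, eps=0.1):
        # Epsilon-greedy selection on the composite action value.
        if np.random.rand() < eps:
            return actions[np.random.randint(len(actions))]
        return max(actions, key=lambda a: q_total(state, a))

    def update(state, action, r_task, r_avoid, next_state, actions,
               reuse_avoid=False):
        # One TD update per component, each driven by its own reward signal
        # (an illustrative rule, not necessarily the paper's). With
        # reuse_avoid=True, the previously learnt avoidance table is frozen
        # and only the primary table is updated.
        best_next = max(q_total(next_state, a) for a in actions)
        target = GAMMA * best_next
        Q_primary[(state, action)] += ALPHA * (
            r_task + target - Q_primary[(state, action)])
        if not reuse_avoid:
            Q_avoid[(state, action)] += ALPHA * (
                r_avoid + target - Q_avoid[(state, action)])

Under these assumptions, learning a new task amounts to reinitialising Q_primary and calling update(..., reuse_avoid=True), so the avoidance knowledge carries over unchanged.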
