Abstract

Transfer learning is a hierarchical approach to reinforcement learning of complex tasks modeled as Markov Decision Processes. The learning results on the source task are used as the starting point for the learning on the target task. In this paper we deal with a hierarchy of constrained systems, where the source task is an under-constrained system, hence called the Partially Constrained Model (PCM). Constraints in the framework of reinforcement learning are dealt with by state-action veto policies. We propose a theoretical background for the hierarchy of training refinements, showing that the effective action repertoires learnt on the PCM are maximal, and that the PCM-optimal policy gives maximal state value functions. We apply the approach to learn the control of Linked Multicomponent Robotic Systems using Reinforcement Learning. The paradigmatic example is the transportation of a hose. The system has strong physical constraints and a large state space. Learning experiments in the target task are realized over an accurate but computationally expensive simulation of the hose dynamics. The PCM is obtained simplifying the hose model. Learning results of the PCM Transfer Learning show an spectacular improvement over conventional Q-learning on the target task.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call