Abstract

Lifelong reinforcement learning (LRL) is an important approach to achieve continual lifelong learning of multiple reinforcement learning tasks. The two major methods used in LRL are task decomposition and policy knowledge extraction. Policy knowledge extraction method in LRL can share knowledge for tasks in different task domains and for tasks in the same task domain with different system environmental coefficients. However, the generalization ability of policy knowledge extraction method is limited on learned tasks rather than learned task domains. In this paper, we propose a cross-domain lifelong reinforcement learning algorithm with intra-domain knowledge generalization ability (CDLRL-DKG) to improve generalization ability of policy knowledge extraction method from learned tasks to learned task domains. In experiments, we evaluated CDLRL-DKG performance on three task domains. And our results show that the proposed algorithm can directly get approximate optimal policy from given dynamical system environmental coefficients in task domain without added cost of computing time and storage space.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call