Abstract

Improving the generalization ability of reinforcement learning (RL) agents is an open and challenging problem that has gradually received attention in recent years. Previous work attempts to address part of this problem by recombining several learned policies to achieve compositional generalization. However, these methods are usually limited to tasks composed of fixed subtasks and assume no task-agnostic changes in the environment. In this paper, we propose a more flexible method, termed MER, that learns a compositionally generalizable policy conditioned on task-dependent invariant elements shared among tasks rather than on fixed subtasks. As a result, the learned policy can overcome task-agnostic changes in the environment and generalize to a wider range of compositional tasks. Theoretical analysis and experimental results show that our method outperforms traditional methods and exhibits superior generalization ability on unseen new tasks.
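
To make the contrast in the abstract concrete, the following is a minimal, purely illustrative sketch (not the paper's MER algorithm) of the two policy structures being compared: (a) composing separately learned per-subtask policies, and (b) a single policy conditioned on a task representation inferred from recent experience. The class names, the linear policies, and the `infer_task_embedding` averaging encoder are all hypothetical stand-ins chosen only for illustration.

```python
# Illustrative sketch only: contrasts (a) composing fixed per-subtask policies with
# (b) one policy conditioned on an inferred task representation. Names such as
# SubtaskCompositionPolicy, InvariantConditionedPolicy, and infer_task_embedding
# are hypothetical and are not taken from the paper.
import numpy as np

OBS_DIM, ACT_DIM, TASK_DIM = 8, 2, 4


def linear_policy(weights, obs):
    """A toy linear policy: action = W @ obs."""
    return weights @ obs


class SubtaskCompositionPolicy:
    """(a) Prior approach: keep one learned policy per fixed subtask and switch
    between them according to a known subtask schedule. This breaks down when a
    new task is not an exact recombination of the fixed subtasks or when the
    environment changes in task-agnostic ways."""

    def __init__(self, subtask_policies):
        self.subtask_policies = subtask_policies  # dict: subtask id -> weight matrix

    def act(self, obs, subtask_id):
        return linear_policy(self.subtask_policies[subtask_id], obs)


class InvariantConditionedPolicy:
    """(b) The kind of policy the abstract describes: a single policy conditioned
    on a representation extracted from recent experience, rather than tied to a
    fixed subtask decomposition."""

    def __init__(self, weights):
        self.weights = weights  # shape (ACT_DIM, OBS_DIM + TASK_DIM)

    def infer_task_embedding(self, recent_obs):
        # Stand-in for an encoder of invariant task elements: here we simply
        # average recent observations and truncate to TASK_DIM.
        return np.mean(recent_obs, axis=0)[:TASK_DIM]

    def act(self, obs, recent_obs):
        z = self.infer_task_embedding(recent_obs)
        return linear_policy(self.weights, np.concatenate([obs, z]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.normal(size=OBS_DIM)
    history = rng.normal(size=(5, OBS_DIM))  # recent observations for task inference

    composed = SubtaskCompositionPolicy({0: rng.normal(size=(ACT_DIM, OBS_DIM))})
    conditioned = InvariantConditionedPolicy(rng.normal(size=(ACT_DIM, OBS_DIM + TASK_DIM)))

    print("subtask-composed action:", composed.act(obs, subtask_id=0))
    print("invariant-conditioned action:", conditioned.act(obs, history))
```

The design point of the sketch is only structural: policy (a) must be told which fixed subtask it is in, while policy (b) acts from the observation plus whatever task information it can infer, which is the flexibility the abstract attributes to the proposed approach.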
