Abstract

Curriculum learning has the potential to address sparse rewards, a long-standing challenge in reinforcement learning, with greater sample efficiency than standard reinforcement learning algorithms, because it lets agents learn tasks in a meaningful order: from simple tasks to difficult ones. However, most curriculum learning in RL still relies on fixed, hand-designed sequences of tasks. We present a novel automatic curriculum learning scheme for reinforcement learning agents. We propose a two-level hierarchical reinforcement learning framework with a high-level policy, the curriculum generator, and a low-level policy, the action policy. During training, the curriculum generator automatically proposes curricula for the action policy to learn, and our training method guarantees that the proposed curricula are always moderately difficult for the action policy. Both policies are trained simultaneously and independently. After training, the low-level policy can complete all tasks without instructions from the curriculum generator. Experimental results on a wide range of benchmark robotics environments demonstrate that our method considerably accelerates convergence and improves training quality compared with training without the curriculum generator.
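The following is a minimal sketch of the two-level training loop described above: a high-level curriculum generator proposes a goal, the low-level action policy pursues it, and each level is updated from its own experience. All class and method names (CurriculumGenerator, ActionPolicy, propose, update) and the environment interface are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

class CurriculumGenerator:
    """High-level policy: proposes a task (goal) for the action policy."""
    def propose(self, state):
        # Placeholder: sample a candidate goal; a real generator would be
        # trained to keep goals moderately difficult for the current policy.
        return np.random.uniform(-1.0, 1.0, size=3)

    def update(self, state, goal, success):
        # Placeholder update: reward goals the action policy sometimes,
        # but not always, achieves (i.e., of moderate difficulty).
        pass

class ActionPolicy:
    """Low-level policy: acts in the environment toward the proposed goal."""
    def act(self, obs, goal):
        # Placeholder: random action; a real policy conditions on (obs, goal).
        return np.random.uniform(-1.0, 1.0, size=2)

    def update(self, transitions):
        # Placeholder update from the collected goal-conditioned transitions.
        pass

def train(env, generator, policy, episodes=1000, horizon=50):
    """Train both levels simultaneously and independently (sketch).

    Assumes a Gym-style env with reset()/step() and an 'is_success' flag.
    """
    for _ in range(episodes):
        obs = env.reset()
        goal = generator.propose(obs)          # high level proposes a curriculum task
        transitions, success = [], False
        for _ in range(horizon):
            action = policy.act(obs, goal)
            next_obs, reward, done, info = env.step(action)
            transitions.append((obs, goal, action, reward, next_obs, done))
            obs = next_obs
            success = success or info.get("is_success", False)
            if done:
                break
        policy.update(transitions)             # low level learns from its own rollout
        generator.update(obs, goal, success)   # high level learns from task outcome
```

After training under this scheme, only the low-level policy is needed at test time, since it no longer relies on goals proposed by the curriculum generator.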
