Abstract
Reinforcement learning is based on the principle of an agent interacting with an environment in order to maximize the reward it receives. It has shown strong results on a variety of control problems. However, attempts to train a multitask agent suffer from so-called "catastrophic forgetting": the knowledge the agent gains about one task is erased while it develops a strategy for another task. One method of countering catastrophic forgetting during multitask learning is to store previously encountered states in a so-called experience replay buffer. We developed a method that allows a student agent to exchange experience with teacher agents through an experience replay buffer. This experience exchange allowed the student to behave effectively in several environments simultaneously. The exchange is based on knowledge distillation, which reduces the off-policy reinforcement learning problem to a supervised learning task. We tested several combinations of loss functions and output-transforming functions. Knowledge distillation requires a massive experience replay buffer, so we suggest several ways to reduce its size. The first approach uses a subset of the whole buffer; the second uses an autoencoder to convert states to a latent space. Although our methods can be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate them.
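To illustrate how distillation turns the problem into supervised learning, the following is a minimal sketch of one possible pairing of a loss function and an output-transforming function. The abstract does not say which combinations were tested, so everything here is an assumption for illustration only: the choice of a temperature softmax as the output transform, the Kullback-Leibler divergence as the loss, the function names, and the temperature value.

```python
import numpy as np


def softmax(x, temperature=1.0):
    """Output transform (assumed): temperature-scaled softmax over actions."""
    z = np.asarray(x, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_kl_loss(teacher_out, student_out, temperature=0.01):
    """Mean KL(teacher || student) over a batch of states from the replay buffer.

    teacher_out, student_out: arrays of shape (batch, n_actions) holding the
    raw network outputs for the same states. The teacher outputs are sharpened
    with a low temperature (an assumed value); the student uses temperature 1.
    """
    p = softmax(teacher_out, temperature)  # supervised target distribution
    q = softmax(student_out, 1.0)          # student prediction
    eps = 1e-12                            # avoids log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(np.mean(kl))


# Hypothetical outputs for two buffered states and three actions.
teacher = np.array([[1.0, 2.0, 0.5], [0.1, 0.0, 0.3]])
student = np.array([[0.8, 1.5, 0.7], [0.2, 0.1, 0.1]])
loss = distillation_kl_loss(teacher, student)
```

In this sketch the teacher's outputs on buffered states act as fixed labels, so the student can be trained with ordinary minibatch gradient descent on the loss, without interacting with the environment.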