A Method for Catastrophic Forgetting Prevention during Multitasking Reinforcement Learning

I N Agliukov, K V Sviatov, S V Sukhov

doi:10.17587/mau.23.414-419

Abstract

Reinforcement learning is based on the principle of an agent interacting with an environment in order to maximize the reward it receives, and it has shown strong results on a variety of control problems. However, attempts to train a multitask agent suffer from so-called "catastrophic forgetting": the knowledge the agent has gained about one task is erased while it develops a strategy for solving another task. One way to counter catastrophic forgetting during multitask learning is to store previously encountered states in a so-called experience replay buffer. We developed a method that allows a student agent to exchange experience with teacher agents through an experience replay buffer. This experience exchange allowed the student to behave effectively in several environments simultaneously. The exchange was based on knowledge distillation, which reduces the off-policy reinforcement learning problem to a supervised learning task. We tested several combinations of loss functions and output-transforming functions. Knowledge distillation requires a massive experience replay buffer, so we suggest several ways to optimize the size of the buffer. The first approach uses a subset of the whole buffer; the second uses an autoencoder to convert states to a latent space. Although our methods can be applied to a wide range of problems, we use Atari games as a test environment to demonstrate them.
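
To make the supervised view of distillation concrete, the following minimal Python sketch pairs one possible output-transforming function (a softmax with temperature) with one possible loss function (Kullback-Leibler divergence), and draws a random subset of a replay buffer. The abstract does not say which combination the authors selected, so everything here is an assumption for illustration: the function names, the temperature value, the sampling fraction, and the buffer layout of (state, teacher Q-values) pairs are all hypothetical.

```python
import math
import random


def log_softmax(values, temperature=1.0):
    """Log of the softmax over `values`, scaled by an assumed temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def kl_distillation_loss(teacher_q, student_q, temperature=0.01):
    """KL(teacher || student) between action distributions.

    The teacher's Q-values are sharpened with `temperature`; the student's
    outputs use temperature 1. Both choices are illustrative assumptions.
    """
    log_p = log_softmax(teacher_q, temperature)  # teacher targets
    log_q = log_softmax(student_q, 1.0)          # student predictions
    loss = 0.0
    for lp, lq in zip(log_p, log_q):
        p = math.exp(lp)
        if p > 0.0:  # skip terms that underflow to zero probability
            loss += p * (lp - lq)
    return loss


def sample_subset(buffer, fraction, seed=0):
    """Keep a random fraction of the buffer (hypothetical size reduction)."""
    rng = random.Random(seed)
    k = max(1, int(len(buffer) * fraction))
    return rng.sample(buffer, k)


# Hypothetical buffer: each entry is (state, teacher Q-values for 3 actions).
buffer = [([float(i)], [0.1 * i, 0.2, 0.3]) for i in range(100)]
subset = sample_subset(buffer, fraction=0.25)

# Average distillation loss of a (hypothetical) constant student output.
student_output = [0.0, 0.0, 0.0]
mean_loss = sum(
    kl_distillation_loss(teacher_q, student_output) for _, teacher_q in subset
) / len(subset)
print(len(subset), mean_loss)
```

In this sketch the loss is zero only when the student's action distribution matches the teacher's, which is what turns the problem into ordinary supervised regression on the stored states. A trainable student network and the autoencoder-based compression of states are left out to keep the example short.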

More From: Mekhatronika, Avtomatizatsiya, Upravlenie
