Abstract

State-of-the-art Deep Reinforcement Learning algorithms such as DQN and DDPG rely on a replay buffer known as Experience Replay. By default, this buffer contains only the experiences gathered during runtime. We propose a method called Interpolated Experience Replay that uses the stored (real) transitions to create synthetic ones that assist the learner. In this first approach to the field, we limit ourselves to discrete and non-deterministic environments and use a simple, equally weighted average of the observed rewards in combination with observed follow-up states. We demonstrate a significant improvement in the overall mean reward compared to a DQN with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
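
The following is a minimal sketch (in Python) of the interpolation idea described above: a synthetic transition is formed from the stored real transitions that share a state-action pair, using an equally weighted average of their rewards together with an observed follow-up state. Class and method names, the buffer capacity, and the way synthetic samples are mixed into a minibatch are illustrative assumptions, not the authors' implementation.

    import random
    from collections import defaultdict, namedtuple

    Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

    class InterpolatedBuffer:
        """Stores real transitions and interpolates synthetic ones per (state, action) pair.
        Hypothetical sketch; not the authors' code."""

        def __init__(self, capacity=10_000):
            self.capacity = capacity
            self.real = []                      # real transitions, oldest first (FIFO)
            self.by_sa = defaultdict(list)      # index: (state, action) -> real transitions

        def store(self, t):
            if len(self.real) >= self.capacity:
                old = self.real.pop(0)
                self.by_sa[(old.state, old.action)].remove(old)
            self.real.append(t)
            self.by_sa[(t.state, t.action)].append(t)

        def synthesize(self, state, action):
            # Equally weighted average of all rewards observed for (state, action),
            # paired with one observed follow-up state (the environment is non-deterministic).
            group = self.by_sa[(state, action)]
            if not group:
                return None
            avg_reward = sum(t.reward for t in group) / len(group)
            follow_up = random.choice(group)
            return Transition(state, action, avg_reward, follow_up.next_state, follow_up.done)

        def sample(self, batch_size):
            # Draw real transitions and replace each with its synthetic counterpart when possible;
            # the replacement ratio here is an assumption for illustration.
            batch = random.sample(self.real, min(batch_size, len(self.real)))
            return [self.synthesize(t.state, t.action) or t for t in batch]

Such a buffer could be plugged into a standard DQN training loop in place of the vanilla replay memory; only the sampling step changes.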

Highlights

  • In the domain of Deep Reinforcement Learning (RL), Experience Replay (ER) has long been a well-established standard for many algorithms [1,2,3]

  • Our ER variant aims to improve the performance of these algorithms in non-deterministic and discrete environments

  • We present an extension of the classic ER used in Deep RL that includes synthetic experiences to speed up and improve learning in non-deterministic and discrete environments

Summary

Introduction

In the domain of Deep Reinforcement Learning (RL), Experience Replay (ER) has long been a well-established standard for many algorithms [1,2,3]. There are other approaches as well; these extensions focus on the usage and creation of experiences that are synthetic in some way. An example is the so-called Hindsight Experience Replay [3], which saves trajectories of states and actions together with a corresponding goal. We first introduce the idea of Experience Replay and then present the basics of Deep Reinforcement Learning, along with an explanation of why the former concept is mandatory here. In a non-episodic/infinite environment (and in an episodic one after enough time has passed), we would run into the problem of limited storage. To counteract this issue, the vanilla ER is realized as a FIFO buffer, and old experiences are discarded once the maximum length is reached.
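
As a concrete illustration of this vanilla FIFO buffer, the sketch below uses Python's collections.deque with a maximum length, so the oldest experiences are dropped automatically; the capacity value and sampling interface are assumptions for illustration, not a specific library API.

    import random
    from collections import deque

    class ReplayBuffer:
        """Vanilla FIFO Experience Replay: old experiences are dropped at maximum length."""

        def __init__(self, max_length=50_000):
            self.buffer = deque(maxlen=max_length)   # deque discards the oldest entry when full

        def store(self, state, action, reward, next_state, done):
            self.buffer.append((state, action, reward, next_state, done))

        def sample(self, batch_size):
            # Uniform random minibatch over the stored (real) experiences.
            return random.sample(list(self.buffer), batch_size)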
