Abstract

The trade-off between exploration and exploitation remains one of the main challenges in reinforcement learning, affecting sampling efficiency, solution optimality, and transferability. Built on the Go-Explore framework, currently the most effective framework for environments with sparse rewards, Latent Go-Explore (LGE) removes the need to manually design state features. However, its state feature space is not effective enough for measuring sampling density, and its exploration mode, which treats a single state as the unit of exploration, is inefficient. To this end, this paper proposes a variant of LGE that uses state areas as the unit of exploration, named ALGE, which encodes real environment distances into the state feature space and explores at the granularity of state areas to further improve exploration efficiency. The proposed ALGE is evaluated through a series of experiments in multiple hard-exploration environments, including a continuous-maze environment, a robot environment, and two Atari environments. The results demonstrate that ALGE improves the explored-space coverage rate by 12% and 27% in the continuous-maze and robot environments, respectively, and outperforms state-of-the-art algorithms of the Go-Explore framework in terms of pure exploration in hard-exploration environments including Montezuma's Revenge and Pitfall.
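To make the area-based exploration idea concrete, the sketch below illustrates one plausible reading of goal selection over state areas: visited latent states are partitioned into areas, and sparsely visited areas are preferred as exploration goals. This is a minimal illustration, not the paper's actual algorithm; the function name `select_goal_area`, the random-centroid partitioning, and the inverse-count weighting are all assumptions made for the example.

```python
import numpy as np

def select_goal_area(latent_states, n_areas=50, rng=None):
    """Hypothetical sketch: partition visited latent states into
    areas and sample a goal from a sparsely visited area.

    latent_states: array of shape (N, d), the visited latent states.
    n_areas: number of areas to partition the latent space into
             (an assumed hyperparameter for this illustration).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Pick n_areas visited states as area centroids (a crude stand-in
    # for whatever partitioning the actual method would use).
    idx = rng.choice(len(latent_states), size=n_areas, replace=False)
    centroids = latent_states[idx]
    # Assign each visited latent state to its nearest centroid.
    dists = np.linalg.norm(latent_states[:, None] - centroids[None], axis=-1)
    assignment = dists.argmin(axis=1)
    # Estimate per-area sampling density as a visit count.
    counts = np.bincount(assignment, minlength=n_areas)
    # Lower visit density -> higher selection probability.
    weights = 1.0 / (counts + 1)
    probs = weights / weights.sum()
    goal_area = rng.choice(n_areas, p=probs)
    # Return one representative state from the chosen sparse area.
    members = np.where(assignment == goal_area)[0]
    return latent_states[rng.choice(members)]
```

In single-state LGE-style goal sampling, density would instead be estimated per individual latent state; grouping states into areas, as sketched here, lets one exploration goal stand in for a whole under-visited region, which is the efficiency gain the abstract claims.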
