Abstract

Reinforcement learning (RL) has recently been applied successfully to several robot control tasks with shaped reward functions. However, designing a task-relevant, well-performing reward function takes time and effort. Training an agent to complete a task in a sparse-reward environment would sidestep the difficulty of reward design, but doing so remains a significant challenge. To address this issue, the pioneering hindsight experience replay (HER) method dramatically increases the probability of acquiring skills in sparse-reward environments by transforming unsuccessful experiences into useful training samples. However, HER still requires a lengthy training period. In this article, we propose a new HER-based technique, termed adaptive HER with a goal-amended curiosity module (AHEGC), to further improve sample and exploration efficiency. Specifically, an adaptive adjustment strategy for the hindsight experience (HE) sampling rate and reward weights is developed to enhance sample efficiency. Furthermore, we introduce a curiosity mechanism to encourage more efficient exploration of the environment and propose a goal-amended (GA) curiosity module to counteract the excessive novelty-seeking that the introduced curiosity can cause. We conducted experiments on six demanding robot control tasks with binary rewards in the Fetch and Hand environments. The results show that the proposed method outperforms existing methods in learning ability and convergence speed.
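To make the core HER idea concrete, the sketch below illustrates hindsight goal relabeling with the common "future" strategy under a Fetch-style binary reward (0 on success, -1 otherwise). This is a generic illustration of HER, not the paper's AHEGC method; the episode layout, `her_relabel`, and `sparse_reward` are illustrative assumptions.

```python
import numpy as np

def her_relabel(episode, reward_fn, k=4, rng=None):
    """Hindsight relabeling ("future" strategy): for each transition,
    also store copies whose goal is an achieved goal sampled from the
    same or a later step, so failed rollouts still yield reward signal.
    Each episode entry is (obs, action, achieved_goal, desired_goal)."""
    rng = rng or np.random.default_rng(0)
    relabeled, T = [], len(episode)
    for t, (obs, action, achieved_goal, desired_goal) in enumerate(episode):
        # Original transition, rewarded against the true (sparse) goal.
        relabeled.append((obs, action, desired_goal,
                          reward_fn(achieved_goal, desired_goal)))
        # Hindsight transitions: pretend a later achieved goal was the target.
        for i in rng.integers(t, T, size=min(k, T - t)):
            new_goal = episode[i][2]  # achieved goal at step i >= t
            relabeled.append((obs, action, new_goal,
                              reward_fn(achieved_goal, new_goal)))
    return relabeled

def sparse_reward(achieved, goal, tol=0.05):
    """Binary reward convention used in the Fetch/Hand tasks."""
    dist = np.linalg.norm(np.asarray(achieved) - np.asarray(goal))
    return 0.0 if dist < tol else -1.0
```

Even when every original transition carries reward -1, the relabeled buffer contains transitions with reward 0, which is what lets off-policy RL learn from otherwise unsuccessful episodes.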