Abstract

Exploration in deep reinforcement learning remains limited, and inefficient exploration in turn leads to poor sample efficiency. To address the exploration difficulties caused by white-noise interference and by the detachment/derailment problem, we introduce a refined feature extraction module that turns prediction errors into intrinsic rewards, together with an auxiliary-agent training paradigm. This combination resolves the problems above and substantially improves the agent's ability to explore environments with sparse rewards. We validate the optimized feature extraction module through comparative experiments on hard-exploration scenarios commonly used in reinforcement learning research. We further evaluate the full method on the Atari 2600 benchmark, where it yields notable performance gains and achieves superior results in six selected environments.
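To make the prediction-error mechanism concrete, the sketch below shows one common way such intrinsic rewards are computed: a fixed, randomly initialized target encoder and a trained predictor encoder, with the per-observation prediction error used as an exploration bonus (in the style of Random Network Distillation). The network sizes, `obs_dim`, and the training loop are illustrative assumptions; the paper's actual feature extraction module and auxiliary-agent paradigm are not reproduced here.

```python
# Minimal sketch of prediction-error intrinsic rewards (RND-style).
# All shapes and hyperparameters are placeholders, not the paper's settings.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Small MLP encoder mapping observations to feature vectors."""
    def __init__(self, obs_dim: int, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

obs_dim = 8                    # hypothetical observation size
target = Encoder(obs_dim)      # fixed, randomly initialized feature extractor
predictor = Encoder(obs_dim)   # trained to match the target's features
for p in target.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-4)

def intrinsic_reward(obs: torch.Tensor) -> torch.Tensor:
    """Per-observation prediction error, used as an exploration bonus."""
    with torch.no_grad():
        tgt = target(obs)
    pred = predictor(obs)
    return ((pred - tgt) ** 2).mean(dim=-1)

# One update step on a batch of observations: the predictor minimizes the
# same error that is handed to the agent as an intrinsic reward, so the
# bonus decays for states visited often and stays high for novel states.
obs_batch = torch.randn(32, obs_dim)
loss = intrinsic_reward(obs_batch).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because the target network is fixed, the prediction error depends only on how often similar observations have been seen, which is what makes it usable as a novelty signal under sparse extrinsic rewards.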
