Abstract

Recently, an emerging research direction called Evolutionary Reinforcement Learning (ERL) has been proposed, which combines evolutionary algorithms with reinforcement learning (RL) to tackle sequential decision-making tasks. However, existing ERL algorithms often suffer from two challenges: inaccurate policy estimation caused by the overestimation bias in RL, and insufficient exploration caused by inefficient mutations. To alleviate these problems, we propose an Evolutionary Reinforcement Learning algorithm enhanced with Truncated variance and Distillation mutation, called ERL-TD. We utilize multiple Q-networks to evaluate state-action pairs, so that the ensemble provides more accurate evaluations, and the variance of these evaluations can be used to control the overestimation bias in RL. Moreover, we propose a new distillation mutation that provides a promising mutation direction, unlike traditional mutations that generate a large number of random solutions. We evaluate ERL-TD on continuous control benchmarks from OpenAI Gym and the DeepMind Control Suite. The experiments show that ERL-TD achieves excellent performance and outperforms all baseline RL algorithms on the test suites.
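The abstract does not spell out the exact update rule, but the idea of using an ensemble's variance to temper overestimation can be illustrated with a minimal sketch. In the Python snippet below, the function name truncated_q_target and the penalty coefficient kappa are hypothetical placeholders, not the paper's actual formulation: several critics estimate the next-state value, and the mean estimate is shrunk by the ensemble's standard deviation before forming a TD target.

```python
import numpy as np

def truncated_q_target(next_q_values, reward, discount=0.99, kappa=1.0):
    """Variance-penalized TD target from an ensemble of Q estimates.

    next_q_values: array of shape (n_critics,) holding each critic's
    estimate of the next state-action value. The ensemble mean is
    reduced by kappa times the ensemble standard deviation, so that
    high-variance (uncertain) estimates contribute a more pessimistic
    target and overestimation bias is damped.
    """
    mean_q = next_q_values.mean()
    std_q = next_q_values.std()
    pessimistic_q = mean_q - kappa * std_q
    return reward + discount * pessimistic_q

# Example: five critics disagree about the next-state value.
ensemble_estimates = np.array([10.2, 9.8, 11.5, 10.0, 12.1])
target = truncated_q_target(ensemble_estimates, reward=1.0)
print(f"TD target: {target:.3f}")
```

This is only a sketch under the assumption that the truncation acts as a variance penalty on the ensemble mean; the paper's precise mechanism may differ.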
