Guiding Evolutionary Strategies with Off-Policy Actor-Critic

Yunhao Tang

doi:10.48448/7mb4-cj72

Abstract

Evolutionary strategies (ES) and off-policy learning algorithms are two major workhorses of reinforcement learning (RL). ES adopt a simple black-box approach to optimization but can be slightly less sample efficient; off-policy learning is by design more sample efficient, but its updates can be unstable. Motivated by these trade-offs, we propose CEM-ACER, a combination of the cross-entropy method (CEM), a standard ES algorithm, and actor-critic with experience replay (ACER), an off-policy actor-critic algorithm. Our proposal relies on a key insight: off-policy algorithms provide a natural mechanism to efficiently evolve parameter populations as part of an ES algorithm. Across a wide range of benchmark control tasks, we show that CEM-ACER balances the strengths of CEM and ACER, leading to an algorithm that consistently outperforms its individual building blocks, as well as other competitive baseline algorithms.
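To make the idea of evolving a parameter population with help from a gradient-style learner more concrete, here is a minimal sketch of a generic cross-entropy method loop on a toy objective. It is not the CEM-ACER algorithm from the paper: the function name `cem_with_refinement`, the `refine` hook, the choice to refine half of the population, and all hyperparameters are assumptions made purely for illustration. The `refine` callable only stands in for whatever off-policy update an actual implementation would apply.

```python
import numpy as np


def cem_with_refinement(score, refine, dim, iters=50, pop_size=20,
                        elite_frac=0.25, seed=0):
    """Generic CEM loop with a hypothetical refinement hook (illustrative only)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Sample a population from the current diagonal Gaussian.
        population = mean + std * rng.standard_normal((pop_size, dim))
        # Assumption: half of the population gets one refinement step,
        # standing in for an off-policy update; the rest stays as sampled.
        half = pop_size // 2
        population[:half] = np.array([refine(p) for p in population[:half]])
        # Score every member and keep the top-scoring elites.
        scores = np.array([score(p) for p in population])
        elite = population[np.argsort(scores)[-n_elite:]]
        # Refit the sampling distribution to the elites.
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3  # small floor keeps some exploration
    return mean


# Toy problem (hypothetical): maximize the negative squared distance to a target.
target = np.array([1.0, -2.0, 0.5])


def score(p):
    return -np.sum((p - target) ** 2)


def refine(p):
    # One gradient-ascent step on the toy score with step size 0.1.
    return p - 0.2 * (p - target)


best = cem_with_refinement(score, refine, dim=3)
```

In this toy setting the refined members tend to land among the elites, so the sampling distribution is pulled toward better regions faster than sampling alone would manage. A real RL setup would replace `score` with episode returns and `refine` with an actual learning update.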

Similar Papers

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari
Patryk Chrabąszcz ... Ilya Loshchilov
01 Jul 2018

Comparison of efficiency between differential evolution and evolution strategy: application of the LST model to the Be River catchment in Vietnam
Nguyen Thi Thuy Hang ... Hidetaka Chikamori
Paddy and Water Environment | VOL. 15
10 Apr 2017

Application of evolutionary algorithms to optimise one- and two-dimensional gradient chromatographic separations.
Bram Huygens ... Ann Nowé
Journal of Chromatography A | VOL. 1628
28 Jul 2020

A Study on Self-adaptation in the Evolutionary Strategy Algorithm
Noureddine Boukhari ... Mohamed Slimane
01 Jan 2018
