A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients

Ivo Grondman, Lucian Busoniu, Gabriel A D Lopes, Robert Babuska

doi:10.1109/tsmcc.2012.2218595

Abstract

Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their ability to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, no survey is dedicated specifically to actor-critic algorithms. This paper, therefore, describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion on the concepts of reinforcement learning and the origins of actor-critic algorithms, this paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms over the past few years. A review of several standard and natural actor-critic algorithms is given, and the paper concludes with an overview of application areas and a discussion on open issues.

Journal: IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews)
Publication Date: Nov 1, 2012
Citations: 756

Similar Papers

Reward-Punishment Actor-Critic Algorithm Applying to Robotic Non-grasping Manipulation
Taisuke Kobayashi ... Takumi Aotani
01 Aug 2019

Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces
Yahao Xu ... Hongbin Deng
Neurocomputing | VOL. 537
31 Mar 2023

On the sample complexity of actor-critic method for reinforcement learning with function approximation
Harshat Kumar ... Alejandro Ribeiro
Machine Learning | VOL. 112
16 Feb 2023

Deep Deterministic Policy Gradient-based Parameter Selection Method of Notch Filters for Suppressing Mechanical Resonance in Industrial Servo Systems
Tae-Ho Oh ... Sang-Sub Lee
01 Aug 2019
