Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms

Liyuan Zheng, Tanner Fiez, Lillian J Ratliff, Benjamin Chasnov, Zane Alumbaugh

doi:10.1609/aaai.v36i8.20908

Abstract

The hierarchical interaction between the actor and critic in actor-critic-based reinforcement learning algorithms naturally lends itself to a game-theoretic interpretation. We adopt this viewpoint and model the actor-critic interaction as a two-player general-sum game with a leader-follower structure, known as a Stackelberg game. Given this abstraction, we propose a meta-framework for Stackelberg actor-critic algorithms in which the leader follows the total derivative of its objective instead of the usual individual gradient. From a theoretical standpoint, we develop a policy gradient theorem for the refined update and provide a local convergence guarantee for Stackelberg actor-critic algorithms to a local Stackelberg equilibrium. From an empirical standpoint, we demonstrate via simple examples that the learning dynamics we study mitigate cycling and accelerate convergence compared to the usual gradient dynamics, given cost structures induced by actor-critic formulations. Finally, extensive experiments on OpenAI Gym environments show that Stackelberg actor-critic algorithms always perform at least as well as, and often significantly outperform, their standard actor-critic counterparts.
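
The difference between the leader's individual gradient and its total derivative can be illustrated with a minimal sketch. The quadratic game below is a hypothetical toy chosen for this illustration and is not taken from the paper: the leader minimizes f1(x, y) = x*y and the follower minimizes f2(x, y) = 0.5*y^2 - x*y, so the follower's best response is y*(x) = x. The total derivative adds to the partial derivative of f1 with respect to x a term that accounts for how the follower's best response moves with x, obtained from the implicit function theorem as dy*/dx = -(d²f2/dy²)^(-1) * d²f2/dydx. All function names are assumptions made for the example.

```python
import math

# Toy game (illustrative assumption, not from the paper):
#   leader   minimizes f1(x, y) = x * y
#   follower minimizes f2(x, y) = 0.5 * y**2 - x * y, so y*(x) = x

def individual_grad_leader(x, y):
    """Usual individual gradient: partial derivative of f1 w.r.t. x."""
    return y

def total_grad_leader(x, y):
    """Total derivative of f1 w.r.t. x through the follower's best response."""
    d1_f1 = y        # df1/dx
    d2_f1 = x        # df1/dy
    d22_f2 = 1.0     # d^2 f2 / dy^2
    d21_f2 = -1.0    # d^2 f2 / dy dx
    dy_dx = -d21_f2 / d22_f2  # implicit function theorem
    return d1_f1 + dy_dx * d2_f1

def follower_grad(x, y):
    """Follower's individual gradient: partial derivative of f2 w.r.t. y."""
    return y - x

def run(leader_grad, steps=100, lr=0.1, x=1.0, y=1.0):
    """Simultaneous gradient steps; returns the distance to the equilibrium (0, 0)."""
    for _ in range(steps):
        gx = leader_grad(x, y)
        gy = follower_grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return math.hypot(x, y)

dist_individual = run(individual_grad_leader)
dist_stackelberg = run(total_grad_leader)
print(dist_individual, dist_stackelberg)
```

In this toy, both updates spiral toward the equilibrium at the origin, and the total-derivative update contracts faster per step. That is consistent in spirit with the behavior described in the abstract, but it says nothing about the paper's actor-critic experiments.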

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 8

