Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient

Shihui Li, Xinyue Cui, Yi Wu, Fei Fang, Honghua Dong, Stuart Russell

doi:10.1609/aaai.v33i01.33014213

Abstract

Despite recent advances in deep reinforcement learning (DRL), agents trained by DRL tend to be brittle and sensitive to the training environment, especially in multi-agent scenarios. In the multi-agent setting, a DRL agent’s policy can easily get stuck in a poor local optimum with respect to its training partners: the learned policy may be only locally optimal against the other agents’ current policies. In this paper, we focus on the problem of training robust DRL agents with continuous actions in the multi-agent learning setting so that the trained agents can still generalize when their opponents’ policies change. To tackle this problem, we propose a new algorithm, MiniMax Multi-agent Deep Deterministic Policy Gradient (M3DDPG), with the following contributions: (1) we introduce a minimax extension of the popular multi-agent deep deterministic policy gradient algorithm (MADDPG) for robust policy learning; (2) since the continuous action space leads to computational intractability in our minimax learning objective, we propose Multi-Agent Adversarial Learning (MAAL) to efficiently solve our proposed formulation. We empirically evaluate our M3DDPG algorithm in four mixed cooperative and competitive multi-agent environments, and the agents trained by our method significantly outperform existing baselines.
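
The abstract does not spell out how the minimax objective is made tractable for continuous actions. One plausible reading, offered here only as a minimal sketch and not as the authors' implementation, is to approximate the inner minimization over the other agents' actions by a single gradient-descent step on agent i's Q-function. Everything in the block below is an assumption made for illustration: `toy_critic` is a hypothetical quadratic stand-in for a learned centralized critic, `toy_critic_grad_others` is its analytic gradient, and `step_size` is an arbitrary perturbation size.

```python
import numpy as np


def toy_critic(own_action, other_actions):
    """Hypothetical stand-in for agent i's centralized Q-function.

    own_action has shape (d,); other_actions has shape (n_others, d).
    Higher values are better for agent i.
    """
    distance_term = np.sum((other_actions - own_action) ** 2)
    effort_term = np.sum(own_action ** 2)
    return float(distance_term - effort_term)


def toy_critic_grad_others(own_action, other_actions):
    """Analytic gradient of toy_critic with respect to other_actions."""
    return 2.0 * (other_actions - own_action)


def worst_case_others(own_action, other_actions, step_size=0.1):
    """Approximate the inner 'min' with one gradient-descent step.

    The other agents' actions are moved in the direction that lowers
    agent i's Q-value, instead of solving the minimization exactly.
    """
    grad = toy_critic_grad_others(own_action, other_actions)
    return other_actions - step_size * grad


own = np.array([0.5, -0.2])
others = np.array([[1.0, 0.0], [-0.5, 0.8]])

perturbed = worst_case_others(own, others)
q_nominal = toy_critic(own, others)    # Q under the observed actions
q_robust = toy_critic(own, perturbed)  # Q under locally worst-case actions

print("nominal Q:", q_nominal)
print("robust  Q:", q_robust)
```

In a training loop, a value such as `q_robust` would take the place of the nominal Q-value in the critic target and in the policy gradient, so that each agent optimizes against locally worst-case partners rather than only the partners it happened to train with.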

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jul 17, 2019
Citations: 172

Similar Papers

Distributed Power Allocation for 6-GHz Unlicensed Spectrum Sharing via Multi-agent Deep Reinforcement Learning
Xiang Zhang ... Sneha Kumar Kasera
04 Apr 2023

A Collaborative Control Method of Dual-Arm Robots Based on Deep Reinforcement Learning
Luyu Liu ... Qianyuan Liu
Applied Sciences | VOL. 11
18 Feb 2021

A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System
Chunyang Hu
Symmetry | VOL. 12
16 Apr 2020

Friend-or-Foe Deep Deterministic Policy Gradient
Hao Jiang ... Yajie Wang
11 Oct 2020
