Deep Skill Chaining with Diversity for Multi-agent Systems*

Zaipeng Xie, Yufeng Zhang, Cheng Ji

doi:10.1007/978-3-031-20503-3_17

Abstract

Multi-agent reinforcement learning relies on reward signals from the environment to guide the convergence of individual agents’ policy networks. However, in a high-dimensional continuous space, the non-stationary environment may provide outdated experiences that prevent convergence. Existing methods can fail to achieve satisfactory training performance because of the inherent non-stationarity of multi-agent systems. We propose a novel reinforcement learning scheme, MADSC, to generate an optimized cooperative policy. Our scheme uses mutual information to evaluate an intrinsic reward function that can generate a cooperative policy based on the option framework. In addition, linking the learned skills into a skill chain significantly accelerates the convergence of agent learning. Hence, multi-agent systems can benefit from MADSC to achieve strategic advantages by significantly reducing the number of learning steps. Experiments are performed on SMAC multi-agent tasks of varying difficulty. Experimental results demonstrate that our proposed scheme, with a single layer of temporal abstraction, can outperform state-of-the-art methods including IQL, QMIX, and hDQN.

Keywords: Reinforcement learning, Multi-agent systems, Temporal abstraction, Mutual information, Skill discovery
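
The abstract does not state the exact form of the mutual-information intrinsic reward. As a hedged illustration only, the sketch below uses one common formulation from the skill-discovery literature: a variational bound in which a discriminator estimates the probability q(z | s) that skill z produced state s, and the reward is log q(z | s) - log p(z) under a uniform skill prior. The function name, arguments, and numbers are hypothetical and are not taken from MADSC.

```python
import math


def intrinsic_reward(discriminator_probs, skill, num_skills):
    """Illustrative diversity reward: log q(z|s) - log p(z).

    discriminator_probs: assumed output of a skill discriminator, i.e. a
        probability for each skill given the current state (hypothetical).
    skill: index of the skill the agent is currently executing.
    num_skills: number of skills; the prior p(z) is assumed uniform.
    """
    # Clamp to avoid log(0) when the discriminator assigns zero probability.
    q = max(discriminator_probs[skill], 1e-8)
    return math.log(q) - math.log(1.0 / num_skills)


# The state clearly identifies skill 0, so the reward is positive.
distinguishable = intrinsic_reward([0.7, 0.1, 0.1, 0.1], skill=0, num_skills=4)

# The discriminator cannot tell the skills apart, so the reward is zero.
indistinguishable = intrinsic_reward([0.25, 0.25, 0.25, 0.25], skill=0, num_skills=4)

# The state looks like another skill's, so the reward is negative.
confused = intrinsic_reward([0.1, 0.7, 0.1, 0.1], skill=0, num_skills=4)
```

Under this assumed formulation, an agent is rewarded for visiting states that make its current skill easy to identify, which pushes different skills toward distinct behaviours.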

