QDN: An Efficient Value Decomposition Method for Cooperative Multi-agent Deep Reinforcement Learning

Zaipeng Xie, Yufeng Zhang, Pengfei Shao, Weiyi Zhao

doi:10.1109/ictai56018.2022.00183

Abstract

Multi-agent systems have recently received significant attention from researchers in many scientific fields. Value factorization is a popular approach for scaling up cooperative reinforcement learning in multi-agent environments. However, approximating the joint value function may introduce a significant disparity between the estimated and actual joint reward value functions, which can trap cooperative multi-agent deep reinforcement learning in a local optimum. In addition, as the number of agents increases, the input space grows exponentially, which degrades the convergence performance of multi-agent algorithms. This work proposes an efficient multi-agent reinforcement learning algorithm, QDN, to improve convergence performance in cooperative multi-agent tasks. QDN uses a competitive network so that agents can learn the value of the environmental state without the influence of actions. This significantly reduces the error between the estimated and actual joint reward value functions and helps prevent sub-optimal actions from emerging. QDN also injects parametric noise into the network weights so that agents can explore environments and states effectively, which further improves convergence. We evaluate QDN on SMAC challenges across maps of varying difficulty. Experimental results show that QDN achieves faster convergence and a higher success rate in all scenarios than several state-of-the-art methods. Further experiments on four additional multi-agent tasks demonstrate that QDN is robust across a variety of multi-agent tasks and can significantly improve training convergence compared with state-of-the-art methods.
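The two mechanisms named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that the "competitive network" behaves like a dueling-style head, which separates a state value V(s) from per-action advantages A(s, a), and that the "parametric noise" follows the familiar pattern of weights of the form mu + sigma * eps. The names `dueling_q_values` and `NoisyLinear`, the initialization constants, and the omission of a bias term are all hypothetical choices made for illustration.

```python
import random


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q-values (assumed dueling-style head).

    Uses the mean-subtracted aggregation
        Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'),
    so the state value is learned independently of any single action.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


class NoisyLinear:
    """Bias-free linear layer with parametric Gaussian noise on its weights.

    Effective weight: w = mu + sigma * eps, with eps ~ N(0, 1).
    In a trained network mu and sigma would be learned; here they are
    fixed at illustrative initial values (an assumption of this sketch).
    """

    def __init__(self, in_features, out_features, sigma_init=0.5, rng=None):
        self.rng = rng if rng is not None else random.Random(0)
        bound = 1.0 / in_features ** 0.5
        self.mu = [[self.rng.uniform(-bound, bound) for _ in range(in_features)]
                   for _ in range(out_features)]
        self.sigma = [[sigma_init * bound] * in_features
                      for _ in range(out_features)]
        self.eps = [[0.0] * in_features for _ in range(out_features)]
        self.resample_noise()

    def resample_noise(self):
        """Draw fresh noise; calling this between steps drives exploration."""
        self.eps = [[self.rng.gauss(0.0, 1.0) for _ in row] for row in self.eps]

    def forward(self, x, noisy=True):
        """Return the layer output, with or without the weight noise."""
        out = []
        for mu_row, sigma_row, eps_row in zip(self.mu, self.sigma, self.eps):
            total = 0.0
            for xi, m, s, e in zip(x, mu_row, sigma_row, eps_row):
                weight = m + s * e if noisy else m
                total += weight * xi
            out.append(total)
        return out


# Dueling aggregation: the Q-values are centred on the state value.
print(dueling_q_values(1.0, [0.0, 2.0, 4.0]))  # [-1.0, 1.0, 3.0]

# Noisy advantage head feeding the dueling aggregation for one agent.
features = [0.5, -1.0, 2.0]
advantage_head = NoisyLinear(in_features=3, out_features=4)
print(dueling_q_values(0.2, advantage_head.forward(features)))
```

Calling `resample_noise()` before each forward pass changes the agent's Q-values, and therefore its greedy action, without a separate epsilon-greedy schedule. Passing `noisy=False` gives the deterministic output one would typically use at evaluation time.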


Similar Papers


A Survey of Deep Reinforcement Learning Based on Multi-Particle Environments
Chuhao Weng
Highlights in Science, Engineering and Technology | VOL. 85
13 Mar 2024

The cooperative reinforcement learning in a multi-agent design system
Hong Liu ... Jihua Wang
01 Jun 2013

Cooperative Strategy Learning in Multi-Agent Environment with Continuous State Space
Jun-Yuan Tao ... De-Sheng Li
01 Jan 2006

Tracking Learning Based on Gaussian Regression for Multi-agent Systems in Continuous Space
Xin Chen ... Hai-Jun Wei
Acta Automatica Sinica | VOL. 39
28 Mar 2014
