Abstract

Multi-agent systems have recently received significant attention from researchers in many scientific fields. Value factorization is a popular method for scaling up cooperative reinforcement learning in multi-agent environments. However, approximating the joint value function can introduce a significant disparity between the estimated and the actual joint reward value function, which can lead cooperative multi-agent deep reinforcement learning to a local optimum. In addition, as the number of agents increases, the input space grows exponentially, which degrades the convergence performance of multi-agent algorithms. This work proposes an efficient multi-agent reinforcement learning algorithm, QDN, to enhance convergence performance in cooperative multi-agent tasks. QDN uses a competitive network that lets the agents learn the value of the environmental state without the influence of actions. As a result, the error between the estimated and the actual joint reward value function is significantly reduced, which prevents the emergence of sub-optimal actions. QDN also applies parametric noise to the network weights, introducing randomness so that the agents can explore environments and states effectively, which improves the convergence performance of the algorithm. We evaluate QDN on the SMAC challenges with maps of various difficulties. Experimental results show that QDN outperforms several state-of-the-art methods in convergence speed and success rate in all scenarios. Further experiments on four additional multi-agent tasks demonstrate that QDN is robust across various multi-agent tasks and significantly improves training convergence performance compared with the state-of-the-art methods.
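
The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the network details, so everything below is an assumption: the mean-subtracted aggregation Q = V + (A - mean(A)) for separating the state value from the action advantages, the factorized Gaussian noise on the weights, and the hypothetical names `dueling_q_values`, `NoisyLinear`, and `sigma0`. It is not the authors' implementation.

```python
import math
import random


def dueling_q_values(state_value, advantages):
    """Combine a state value V(s) with per-action advantages A(s, a).

    Assumed aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .)),
    so V(s) is learned independently of the chosen action.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def _scale_noise(x):
    # Signed square root, the usual transform for factorized noise.
    return math.copysign(math.sqrt(abs(x)), x)


class NoisyLinear:
    """Linear layer with weights mu + sigma * eps (hypothetical helper)."""

    def __init__(self, n_in, n_out, sigma0=0.5, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_out = n_in, n_out
        bound = 1.0 / math.sqrt(n_in)
        # Mean parameters; in training these would be learned.
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.b_mu = [self.rng.uniform(-bound, bound) for _ in range(n_out)]
        # Noise scales; also learnable in a real network, fixed here.
        sigma = sigma0 / math.sqrt(n_in)
        self.w_sigma = [[sigma] * n_in for _ in range(n_out)]
        self.b_sigma = [sigma] * n_out

    def forward(self, x, noisy=True):
        if noisy:
            # Fresh noise on every call, so repeated passes differ.
            eps_in = [_scale_noise(self.rng.gauss(0.0, 1.0))
                      for _ in range(self.n_in)]
            eps_out = [_scale_noise(self.rng.gauss(0.0, 1.0))
                       for _ in range(self.n_out)]
        else:
            eps_in = [0.0] * self.n_in
            eps_out = [0.0] * self.n_out
        out = []
        for j in range(self.n_out):
            acc = self.b_mu[j] + self.b_sigma[j] * eps_out[j]
            for i in range(self.n_in):
                w = self.w_mu[j][i] + self.w_sigma[j][i] * eps_out[j] * eps_in[i]
                acc += w * x[i]
            out.append(acc)
        return out
```

In this sketch, passing `noisy=False` gives the deterministic output of the mean weights, while the default noisy pass perturbs the weights on every call, which is one way the exploration described above could arise from the network itself rather than from an external epsilon-greedy rule.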
