Multi-agent Deep Reinforcement Learning based on Maximum Entropy

Zihao Wang, Yanxin Zhang, Zhiqing Huang, Chenkun Yin

doi:10.1109/imcec51613.2021.9482235

Abstract

Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning, and is currently a hot research topic in artificial intelligence. Multi-agent deep reinforcement learning applies the ideas and algorithms of deep reinforcement learning to the learning and control of multi-agent systems, and is an important method for developing multi-agent systems with swarm intelligence. Multi-agent deep deterministic policy gradient (MADDPG) is the most popular model-free multi-agent reinforcement learning algorithm. Because its policy network outputs a single deterministic action, MADDPG suffers from low training efficiency and slow convergence. To address this problem, this paper incorporates the maximum entropy reinforcement learning algorithm soft actor-critic (SAC), so that each agent's policy network outputs actions according to a stochastic policy, and proposes MASAC, a multi-agent deep reinforcement learning algorithm based on maximum entropy. The experimental results show that MASAC trains faster than MADDPG. In addition, the trained agents perform well, behave stably, and are robust to interference.
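The contrast between a deterministic actor and a maximum-entropy stochastic actor can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the scalar action, the tanh-squashed Gaussian parameterization, and the values of `gamma` and `alpha` are all assumptions chosen for illustration, following the commonly used soft actor-critic formulation.

```python
import math
import random


def deterministic_action(mean):
    """DDPG/MADDPG-style actor head: one fixed action per observation."""
    return math.tanh(mean)


def stochastic_action(mean, log_std, rng):
    """SAC-style actor head (assumed form): tanh-squashed Gaussian sample.

    Returns the action and its log-probability under the policy.
    """
    std = math.exp(log_std)
    noise = rng.gauss(0.0, 1.0)
    pre_tanh = mean + std * noise  # reparameterized Gaussian sample
    action = math.tanh(pre_tanh)
    # Log-density of the Gaussian sample before squashing.
    log_prob = -0.5 * noise ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change-of-variables correction for the tanh squashing.
    log_prob -= math.log(1.0 - action ** 2 + 1e-6)
    return action, log_prob


def soft_target(reward, next_q, next_log_prob, gamma=0.99, alpha=0.2):
    """Entropy-regularized critic target: the -alpha * log_prob term
    rewards policies that keep their action distribution spread out.
    gamma and alpha here are illustrative defaults, not the paper's."""
    return reward + gamma * (next_q - alpha * next_log_prob)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-agent actor outputs (mean, log_std) for three agents.
    actor_outputs = [(0.3, -1.0), (-0.5, -0.5), (0.0, 0.0)]
    for i, (mean, log_std) in enumerate(actor_outputs):
        fixed = deterministic_action(mean)
        sampled, log_prob = stochastic_action(mean, log_std, rng)
        print(f"agent {i}: deterministic={fixed:.3f} "
              f"sampled={sampled:.3f} log_prob={log_prob:.3f}")
```

In this sketch the deterministic head always returns the same action for the same input, while the stochastic head explores by sampling, and the critic target adds an entropy bonus weighted by the assumed temperature `alpha`.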
