Abstract

Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning, and is currently a hot research topic in artificial intelligence. Multi-agent deep reinforcement learning applies the ideas and algorithms of deep reinforcement learning to the learning and control of multi-agent systems, and is an important method for developing multi-agent systems with swarm intelligence. Multi-agent deep deterministic policy gradient (MADDPG) is the most popular model-free multi-agent reinforcement learning algorithm. However, because its policy network outputs a single deterministic action, MADDPG suffers from low learning and training efficiency and slow convergence. To address this problem, this paper draws on the maximum entropy reinforcement learning algorithm soft actor-critic (SAC) so that each agent's policy network outputs actions according to a stochastic policy, and proposes MASAC, a multi-agent deep reinforcement learning algorithm based on maximum entropy. The experimental results show that MASAC trains faster than MADDPG. In addition, the trained agents perform well, behave stably, and are robust to interference.
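The contrast between a deterministic actor and a maximum entropy stochastic actor can be illustrated with a minimal single-agent sketch. It is not the paper's implementation: the function names, the tanh-squashed Gaussian policy, the scalar action, and the temperature `alpha` are all assumptions chosen for illustration, following the usual soft actor-critic formulation.

```python
import math
import random

def deterministic_action(mean):
    """Deterministic actor (MADDPG-style): one fixed action per observation."""
    return math.tanh(mean)

def stochastic_action(mean, log_std, rng):
    """Stochastic actor (SAC-style, assumed form): sample a Gaussian, squash with tanh."""
    std = math.exp(log_std)
    u = rng.gauss(mean, std)
    action = math.tanh(u)
    # Log-density of u under the Gaussian N(mean, std^2).
    log_prob = -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)
    # Change-of-variables correction for the tanh squashing (small epsilon for stability).
    log_prob -= math.log(1.0 - action ** 2 + 1e-6)
    return action, log_prob

def soft_value(q_value, log_prob, alpha):
    """Entropy-regularized value: Q(s, a) - alpha * log pi(a | s)."""
    return q_value - alpha * log_prob

rng = random.Random(0)  # fixed seed so the sketch is reproducible
a1, lp1 = stochastic_action(0.0, 0.0, rng)
a2, lp2 = stochastic_action(0.0, 0.0, rng)
```

With the same policy parameters, the deterministic actor always returns the same action, while the stochastic actor returns different samples. The `-alpha * log_prob` term rewards higher-entropy policies, which is the source of the exploration that the maximum entropy approach relies on.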

