Abstract

Multi-agent adversarial scenarios pose several challenges: partial observability, exponential growth of the agents' state and action spaces, a non-stationary environment, and credit assignment. To address these problems, this paper proposes QMIX-NA, a value-decomposition deep reinforcement learning algorithm based on a reward query attention mechanism. The algorithm introduces batch regularization and an attention mechanism to reduce its complexity and improve its performance. Simulation experiments are carried out in SMAC, the StarCraft II micromanagement environment. The results show that QMIX-NA outperforms traditional value-decomposition deep reinforcement learning algorithms.

