Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient

Sheng Yu, Wei Zhu, Yong Wang

doi:10.3390/app13074569

Abstract

Wargames are essential simulators for a wide range of war scenarios. However, the increasing pace of warfare has rendered traditional wargame decision-making methods inadequate. Wargame-assisted decision-making methods that leverage artificial intelligence techniques, notably reinforcement learning, have emerged as a promising response to this challenge. Current wargame environments, however, have a large decision space and sparse rewards, both of which hinder the optimization of decision-making methods. To overcome these hurdles, a wargame decision-making method based on an improved Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is proposed. MADDPG is adapted to the wargame environment by applying a Partially Observable Markov Decision Process (POMDP) formulation, a joint action-value function, and the Gumbel-Softmax estimator. Supervised learning is used in the proposed approach to improve training efficiency and to reduce the space for manipulation before the reinforcement learning phase. In addition, a policy gradient estimator is incorporated to reduce the action space and to obtain the global optimal solution, and an additional reward function is designed to address the sparse reward problem. The experimental results show that the proposed method outperforms both the pre-optimization algorithm and other algorithms based on the actor-critic (AC) framework in the wargame environment. The approach offers a promising solution to the challenging problem of decision-making in wargame scenarios, particularly given the increasing speed and complexity of modern warfare.
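
The abstract names the Gumbel-Softmax estimator as one of the components used to adapt MADDPG, whose policies are deterministic and continuous, to discrete wargame actions. The paper's own implementation is not reproduced on this page, so the following is only a minimal sketch of the generic Gumbel-Softmax relaxation. The function names, the example logits, and the temperature value are illustrative assumptions, not the authors' code.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over discrete actions.

    Assumption: `logits` are unnormalized action scores, e.g. the output
    of a hypothetical actor network; nothing here comes from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        u = rng.random()
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hard_action(probs):
    """Pick the index of the largest entry (the discrete action to execute)."""
    return max(range(len(probs)), key=probs.__getitem__)


# Illustrative usage with made-up logits for three candidate actions.
sample = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.5,
                               rng=random.Random(0))
action = hard_action(sample)
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures make it smoother. In a differentiable framework the relaxed sample is what lets gradients flow from the critic back to the actor; this plain-Python version only illustrates the sampling step.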


Journal: Applied Sciences	Publication Date: Apr 4, 2023
Citations: 2	License type: CC BY 4.0


