Abstract
This article addresses the problem of dynamically optimizing the affine formation of adversarial multi-agent systems in real-time confrontation environments. To maximize both the win rate and the battle damage ratio, a novel deep reinforcement learning (DRL) algorithm named TD3-BC-PPO is designed to adaptively control the shape transformation of the affine formation, including translation, rotation, and scaling. Specifically, Twin Delayed Deep Deterministic Policy Gradient (TD3) is introduced to strengthen the exploration of the strategy and to make efficient use of the crucial sparse win–loss rewards. Behavior cloning (BC) is used for strategy replication and model migration. Proximal policy optimization (PPO) is used to stabilize policy updates and to perform a secondary optimization of the strategy through dense damage-related rewards. To validate the effectiveness and robustness of the proposed algorithm, extensive simulation experiments are conducted, with comparative analyses against several baselines across a diverse range of formation confrontation situations. These situations involve various initial formation structures, differing attack angles, unequal numbers of combatants, and expanded confrontation space dimensions. Simulation results show that the proposed TD3-BC-PPO algorithm increases the win rate by at least 5.4% and the battle damage ratio by at least 0.274 for the affine formation of adversarial multi-agent systems in complex real-time confrontation scenarios.
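To make the three shape transformations named above concrete, the following is a minimal sketch of how a translation, rotation, and scaling could be applied to a nominal formation. It is an illustration under stated assumptions, not the authors' implementation: the function `transform_formation`, its parameters, the 2-D setting, and the three-agent wedge are all hypothetical, and the abstract does not say how the learned policy's output is mapped onto these parameters.

```python
import math

def transform_formation(nominal, scale=1.0, angle=0.0, shift=(0.0, 0.0)):
    """Scale, rotate (radians, counter-clockwise), then translate each
    agent position of a 2-D nominal formation.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    c, s = math.cos(angle), math.sin(angle)
    transformed = []
    for x, y in nominal:
        # Scale and rotate about the origin, then add the translation.
        new_x = scale * (c * x - s * y) + shift[0]
        new_y = scale * (s * x + c * y) + shift[1]
        transformed.append((new_x, new_y))
    return transformed

# Assumed three-agent wedge formation (leader at the origin).
nominal = [(0.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]

# Double the spacing, turn 90 degrees, and move 5 units along x.
moved = transform_formation(nominal, scale=2.0, angle=math.pi / 2, shift=(5.0, 0.0))
```

In a DRL setting of the kind described, the tuple `(scale, angle, shift)` would plausibly form a continuous action chosen at each step, which is one reason continuous-control methods such as TD3 and PPO are natural candidates; that mapping is an assumption here.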