Target encirclement by multiple unmanned surface vessels (USVs) is a common tactic in maritime defense missions and can significantly improve defense efficiency. However, multi-USV systems underutilize the information available to them during such tasks, which limits their ability to cooperate. To address this, a multi-USV target encirclement method based on an improved multi-head attention Q-value mixing network is proposed. First, a reinforcement learning model for USVs is designed that accounts for the complexity of encirclement tasks. Next, distinct action semantics are incorporated into the value calculation so that each USV can assess action values more accurately and make better decisions. Finally, building on the Qatten algorithm framework, a multi-USV target encirclement method with weighted, action-semantics-assisted value function decomposition is introduced. Comparative and ablation experiments in uniform-speed target encirclement scenarios validate the effectiveness and high success rate of the proposed method.
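As an illustration of the value-mixing idea that Qatten-style methods rely on, the following minimal Python sketch combines per-agent Q-values into a joint value using multi-head attention weights. It is a simplified assumption, not the improved network proposed here: the names `attention_mix`, `agent_keys`, `head_queries`, and `head_weights` are hypothetical, the key and query vectors are fixed inputs rather than learned embeddings, and action semantics are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_mix(agent_qs, agent_keys, head_queries, head_weights, bias=0.0):
    """Mix per-agent Q-values into one joint value (illustrative only).

    agent_qs     : one Q-value per USV
    agent_keys   : one key vector per USV (hypothetical agent embedding)
    head_queries : one query vector per attention head (hypothetical
                   embedding of the global state)
    head_weights : one non-negative scalar per head
    bias         : state-dependent constant term, here a plain float
    """
    d = len(agent_keys[0])
    q_tot = bias
    attention = []
    for query, w_h in zip(head_queries, head_weights):
        # Scaled dot-product score of this head against each agent.
        scores = [dot(query, key) / math.sqrt(d) for key in agent_keys]
        lam = softmax(scores)  # attention weights over agents, sum to 1
        attention.append(lam)
        # Each head contributes a weighted sum of the agents' Q-values.
        q_tot += w_h * dot(lam, agent_qs)
    return q_tot, attention


# Example: three USVs, two attention heads, two-dimensional embeddings.
q_tot, attention = attention_mix(
    agent_qs=[1.0, 2.0, 3.0],
    agent_keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    head_queries=[[0.5, 0.5], [1.0, -1.0]],
    head_weights=[0.5, 1.5],
    bias=0.1,
)
```

Because the attention weights and head weights are non-negative, raising any single USV's Q-value never lowers the joint value in this sketch, which is the monotonicity property that value decomposition methods of this kind depend on.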