Building a Connected Communication Network for UAV Clusters Using DE-MADDPG

Zixiong Zhu, Lei Chen, Nianhao Xie, Kang Zong

doi:10.3390/sym13081537

Abstract

Clusters of unmanned aerial vehicles (UAVs) are often used to perform complex tasks. In such clusters, the reliability of the communication network connecting the UAVs is essential to their collective efficiency. Because the wireless environment is complex, however, communication failures within the cluster are likely during flight. When they occur, the cluster must be controlled so that it rebuilds a connected network. The asymmetry of the cluster topology further complicates the control mechanisms. Traditional control methods based on cluster consistency often rely on the motion information of neighboring UAVs, which may become unavailable when communications are interrupted. UAV control algorithms based on deep reinforcement learning have achieved outstanding results in many fields. Here, we propose a cluster control method based on the Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG) to rebuild a communication network for UAV clusters. DE-MADDPG improves the framework of the traditional multi-agent deep deterministic policy gradient (MADDPG) algorithm by decomposing the reward function. We further introduce a reward reshaping function to help the algorithm converge in sparse-reward environments. To address the instability of the state space in the reinforcement learning framework, we also propose a virtual leader–follower model. Extensive simulations show that the success rate of DE-MADDPG is higher than that of the MADDPG algorithm, confirming the effectiveness of the proposed method.

Highlights

The increasing application of unmanned aerial vehicles (UAVs) has led to increasingly complex operating scenarios
We developed the Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG) algorithm to rebuild a connected network for UAV clusters, and proposed a virtual navigator to address the instability of the state space
Simulation results confirmed that the proposed algorithm effectively controls UAV clusters and constructs a connected network, with a success rate much higher than that of the MADDPG algorithm

Summary

Introduction

The increasing application of unmanned aerial vehicles (UAVs) has led to increasingly complex operating scenarios. Beyond military operations such as target destruction and cooperative investigation [1], UAV technology benefits civilian applications such as environmental monitoring, precision agriculture [2,3], disaster relief [4], and traffic monitoring [5]. Because a single UAV increasingly struggles to meet the needs of complex tasks [6], recent advances in technologies such as cluster intelligence have enabled applications built on small UAV cluster systems. For the UAVs in a cluster to cooperate, a connected communication network must be rebuilt whenever the cluster suffers communication problems [7].
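To make the notion of a connected communication network concrete, the following minimal Python sketch checks whether a set of UAVs forms a single connected component. It is an illustration only, not the method proposed in this paper: the disc communication model (two UAVs are linked if their distance is at most `comm_range`), the function name `is_connected`, and the example coordinates are all assumptions introduced here.

```python
from collections import deque
from math import dist


def is_connected(positions, comm_range):
    """Return True if every UAV can reach every other UAV over multi-hop links.

    Assumption (not from the paper): two UAVs share a link whenever their
    Euclidean distance is at most comm_range.
    """
    n = len(positions)
    if n <= 1:
        return True

    # Breadth-first search over the implicit communication graph, from UAV 0.
    visited = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if j not in visited and dist(positions[i], positions[j]) <= comm_range:
                visited.add(j)
                queue.append(j)

    # The network is connected only if the search reached every UAV.
    return len(visited) == n


# Hypothetical 2-D positions: the first cluster forms a chain, the second has a straggler.
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
split = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(is_connected(chain, comm_range=1.0))
print(is_connected(split, comm_range=1.0))
```

Under these assumptions, a check of this kind could serve as the success condition for a rebuilding episode: the cluster has recovered once the function returns `True` for the current positions.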

Objectives

Methods

Results

Conclusion