Deep Reinforcement Learning Based Resource Allocation and Trajectory Planning in Integrated Sensing and Communications UAV Network

Yunhui Qin, Wei Huangfu, Xulong Li, Haijun Zhang, Zhongshan Zhang

doi:10.1109/twc.2023.3260304

Abstract

In this paper, multiple unmanned aerial vehicles (UAVs) serve as mobile aerial integrated sensing and communications (ISAC) platforms that sense and communicate with on-ground target users. To optimize communication and sensing performance, we formulate a joint user association, UAV trajectory planning, and power allocation problem that maximizes the minimum weighted spectral efficiency among UAVs. We apply both centralized and decentralized deep reinforcement learning (DRL) solutions to this sequential decision-making problem. On the one hand, we first introduce the centralized soft actor-critic (SAC) algorithm. We then explore an equivalent transformation of the optimization objective based on the symmetric group, propose random and adaptive data augmentation schemes for designing the replay memory buffer of SAC, and accordingly propose data-augmentation-assisted SAC algorithms to tackle the transformed problem. On the other hand, multi-agent soft actor-critic (MASAC), a decentralized solution, is also introduced to solve the same problem. Experimental results demonstrate the effectiveness of the centralized and decentralized solutions in the considered scenarios. Specifically, SAC assisted by the adaptive scheme significantly outperforms the other centralized solutions in both training speed and weighted spectral efficiency, while the decentralized MASAC algorithm trains fastest in the early stage.
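
The symmetric-group idea in the abstract can be illustrated with a minimal sketch. A max-min objective over UAVs does not change when the UAV indices are relabeled, so a stored transition can be permuted along the UAV axis to produce additional, equally valid replay samples. The abstract does not specify the state and action layout or the details of the random and adaptive schemes, so everything below is an assumption for illustration only: the per-UAV row layout, the function names `min_over_uavs`, `permute_transition`, and `augment_random`, and the use of uniformly random permutations.

```python
import numpy as np


def min_over_uavs(values):
    """Max-min style objective: the smallest per-UAV value.

    `values` is assumed to hold one weighted spectral efficiency per UAV.
    """
    return float(np.min(values))


def permute_transition(state, action, reward, next_state, perm):
    """Relabel the UAVs of one transition.

    Assumes (hypothetically) that row i of `state`, `action`, and
    `next_state` belongs to UAV i, and that the scalar reward is the
    permutation-invariant max-min objective.
    """
    return state[perm], action[perm], reward, next_state[perm]


def augment_random(transition, num_copies, rng):
    """Illustrative 'random' augmentation: the original transition plus
    `num_copies` copies under uniformly random UAV permutations."""
    state, action, reward, next_state = transition
    num_uavs = state.shape[0]
    augmented = [transition]
    for _ in range(num_copies):
        perm = rng.permutation(num_uavs)
        augmented.append(
            permute_transition(state, action, reward, next_state, perm)
        )
    return augmented
```

Each augmented tuple could then be pushed into the replay buffer alongside the original. An adaptive variant would choose which permutations to add based on training feedback, which this sketch does not attempt to model.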

Full Text