UAV-to-Device Underlay Communications: Age of Information Minimization by Multi-Agent Deep Reinforcement Learning

Fanyi Wu, H. Vincent Poor, Hongliang Zhang, Zhu Han, Jianjun Wu, Lingyang Song

doi:10.1109/tcomm.2021.3065135

Abstract

In recent years, unmanned aerial vehicles (UAVs) have unlocked numerous sensing applications, which are expected to add billions of dollars to the world economy in the next decade. To further improve the Quality-of-Service in these applications, the 3rd Generation Partnership Project has considered the use of terrestrial cellular networks to support UAV sensing services, also known as the cellular Internet of UAVs. In this paper, we consider a cellular Internet of UAVs, where the sensory data can be transmitted either to the base station via cellular links, or to the mobile devices by underlay UAV-to-Device (U2D) communications. To evaluate the freshness of the sensory data, the concept of age of information (AoI) is adopted, in which a lower AoI implies fresher data. Since the UAVs' AoIs are determined by their trajectories during sensing and transmission, we investigate the AoI minimization problem for UAVs by designing their trajectories. This problem is a Markov decision problem with an infinite state-action space, and thus we utilize multi-agent deep reinforcement learning to approximate the state-action space. Then, we propose a multi-UAV trajectory design algorithm to solve this problem. Simulation results show that our proposed algorithm can achieve a lower AoI than a greedy algorithm, a policy gradient algorithm, and an overlay U2D scheme.

Full Text