Abstract

Control and performance optimization of wireless networks of Unmanned Aerial Vehicles (UAVs) require scalable approaches that go beyond architectures based on centralized network controllers. At the same time, the performance of model-based optimization approaches is often limited by the accuracy of the approximations and relaxations necessary to solve the UAV network control problem through convex optimization or similar techniques, and by the accuracy of the network and channel models used. To address these challenges, this article introduces a new architectural framework to control and optimize UAV networks based on Deep Reinforcement Learning (DRL). Furthermore, it proposes a virtualized, ‘ready-to-fly’ emulation environment to generate the extensive wireless data traces necessary to train DRL algorithms, traces that are notoriously hard to generate and collect on battery-powered UAV networks. The training environment integrates previously developed wireless protocol stacks for UAVs into the CORE/EMANE emulation tool. Our ‘ready-to-fly’ virtual environment guarantees scalable collection of high-fidelity wireless traces that can be used to train DRL agents. The proposed DRL architecture enables distributed data-driven optimization (with up to 3.7× throughput improvement and 0.2× latency reduction in the reported experiments), facilitates network reconfiguration, and provides a scalable solution for large UAV networks.
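As a rough illustration of the workflow described above, the sketch below shows how wireless traces could be collected from an emulated UAV network and logged for offline DRL training. The environment class, its state and action spaces, and the throughput-based reward are placeholders invented for this example; they only stand in for the CORE/EMANE-based ‘ready-to-fly’ environment and do not reproduce the authors' implementation.

    # Minimal sketch (assumptions): a stub environment stands in for the
    # CORE/EMANE-based 'ready-to-fly' emulator and exposes a Gym-like interface.
    # States, actions, and the throughput-based reward are illustrative only.
    import random

    class StubUavNetworkEnv:
        """Placeholder for an emulated UAV network (not the article's environment)."""

        def __init__(self, n_states=16, n_actions=4):
            self.n_states, self.n_actions = n_states, n_actions
            self.state = 0

        def reset(self):
            self.state = random.randrange(self.n_states)
            return self.state

        def step(self, action):
            # In the real setup, the action would reconfigure network parameters
            # (e.g., transmit power or relay selection) inside the emulation, and
            # the reward would be the measured end-to-end throughput.
            self.state = random.randrange(self.n_states)
            reward = random.random()  # stand-in for measured throughput
            return self.state, reward

    def collect_traces(env, policy, n_steps=1000):
        """Roll out a policy in the emulated network and log (s, a, r, s') tuples."""
        traces, s = [], env.reset()
        for _ in range(n_steps):
            a = policy(s)
            s_next, r = env.step(a)
            traces.append((s, a, r, s_next))
            s = s_next
        return traces

    if __name__ == "__main__":
        env = StubUavNetworkEnv()
        random_policy = lambda s: random.randrange(env.n_actions)
        data = collect_traces(env, random_policy, n_steps=100)
        print(f"collected {len(data)} transitions for offline DRL training")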

Highlights

  • Unmanned Aerial Vehicle (UAV) networks are attracting the interest of the wireless community as a ‘tool’ to provide flexible and on-demand network infrastructure [1], [2].

  • UAV network control as multi-agent Deep Reinforcement Learning (DRL): we model the control problem of a Network Operator (NO) that wishes to dictate the behavior of a distributed network of UAVs as a multi-agent DRL problem employing Q-learning techniques (a minimal sketch follows this list).

  • We report the cumulative distribution function (CDF) of the measured end-to-end network throughput, i.e., the objective function specified by the NO, for different control schemes.
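To make the multi-agent formulation in the highlight above concrete, the following sketch shows the tabular Q-learning update that each UAV agent could run on its own discretized local observations. All names, the state discretization, and the placeholder reward are assumptions made for illustration; the article's agents rely on deep function approximation rather than a lookup table.

    # Minimal sketch (assumptions): one independent Q-learning agent per UAV,
    # each maximizing the NO-specified objective (e.g., end-to-end throughput).
    # Tabular Q-learning is used here only for brevity; the article's agents use
    # deep function approximation instead of a lookup table.
    import random
    from collections import defaultdict

    class UavQAgent:
        def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
            self.q = defaultdict(lambda: [0.0] * n_actions)  # Q[state][action]
            self.n_actions = n_actions
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

        def act(self, state):
            # epsilon-greedy choice over local network-configuration actions
            if random.random() < self.epsilon:
                return random.randrange(self.n_actions)
            values = self.q[state]
            return values.index(max(values))

        def update(self, state, action, reward, next_state):
            # standard Q-learning target: r + gamma * max_a' Q(s', a')
            td_target = reward + self.gamma * max(self.q[next_state])
            self.q[state][action] += self.alpha * (td_target - self.q[state][action])

    # Each UAV runs its own agent on a locally observed (discretized) state; the
    # shared reward is a placeholder for the NO's network-wide objective.
    agents = [UavQAgent(n_actions=4) for _ in range(3)]
    states = [0, 0, 0]
    for _ in range(100):
        actions = [agent.act(s) for agent, s in zip(agents, states)]
        reward = random.random()  # placeholder for measured network throughput
        next_states = [random.randrange(8) for _ in agents]
        for agent, s, a, s_next in zip(agents, states, actions, next_states):
            agent.update(s, a, reward, s_next)
        states = next_states
    print("trained", len(agents), "per-UAV Q-learning agents (illustrative only)")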



Introduction

Unmanned Aerial Vehicle (UAV) networks are attracting the interest of the wireless community as a ‘tool’ to provide flexible and on-demand network infrastructure [1], [2]. Challenge (I) (Fully Wireless Access and Backhaul): UAV networks are fully wireless (i.e., both access and backhaul) and their operations are extremely sensitive to spatially and temporally varying topologies and dynamic RF environments. Basic functionalities such as network formation and point-to-point communications are often impaired by the unstable channel conditions and fragile network connectivity typical of infrastructure-less networked systems.

