Abstract

Active tracking control is essential for UAVs performing autonomous operations in GPS-denied environments. In the active tracking task, a UAV takes high-dimensional raw images as input and executes motor actions to actively follow a dynamic target. Most research focuses on three-stage methods: perception first, then high-level decision-making based on the extracted spatial information of the dynamic target, and finally UAV movement control via a low-level dynamic controller. Perception methods based on deep neural networks are powerful but require considerable manual ground-truth labeling effort. Instead, we unify the perception and decision-making stages into a single high-level controller and leverage deep reinforcement learning to learn the mapping from raw images to high-level action commands in a V-REP-based environment, where simulation data are virtually unlimited and inexpensive. This end-to-end method also has the advantages of a small parameter size and reduced parameter-tuning effort in the decision-making stage. The high-level controller, which has a novel architecture, explicitly encodes the spatial and temporal features of the dynamic target. Auxiliary segmentation and motion-in-depth losses are introduced to generate denser training signals for the high-level controller's fast and stable training. The high-level controller and a conventional low-level PID controller constitute our hierarchical control framework for the UAV active tracking task. Simulation experiments show that our controller, trained with several augmentation techniques, generalizes well to dynamic targets with random appearances and velocities and achieves significantly better performance than three-stage methods.
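The hierarchical structure described above, a learned high-level controller issuing action commands that a conventional low-level PID loop tracks, can be sketched in miniature. Everything below is a hypothetical stand-in for illustration only: the scalar policy, the gains, and the unit-mass dynamics are not the paper's implementation (the actual high-level controller is a deep network operating on raw images).

```python
class PID:
    """Conventional PID controller standing in for the low-level layer."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def high_level_policy(target_pos, uav_pos):
    # Placeholder for the learned high-level controller: here it simply
    # commands a velocity proportional to the 1-D position error.
    return 0.5 * (target_pos - uav_pos)


# One control hierarchy in action: the high-level command is a velocity
# setpoint; the PID loop tracks it with toy unit-mass dynamics.
pid = PID(kp=1.0, ki=0.1, kd=0.05, dt=0.02)
uav_pos, uav_vel = 0.0, 0.0
for _ in range(2000):                        # 40 s of simulated time
    v_cmd = high_level_policy(5.0, uav_pos)  # high-level: desired velocity
    thrust = pid.step(v_cmd - uav_vel)       # low-level: velocity tracking
    uav_vel += thrust * 0.02
    uav_pos += uav_vel * 0.02
```

The split mirrors the paper's design choice: the learned layer only has to reason about *what* motion to command, while the PID layer handles *how* to realize it, which keeps the learned policy small.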

Highlights

  • Unmanned aerial vehicles (UAVs) are becoming an ideal platform to execute dirty and dangerous tasks, due to their high agility and low cost

  • We focus on the active tracking task of UAVs

  • Unlike the latent flow method in [29], which fuses raw images and their differences directly, we focus on the target of interest: we first extract the spatial features of three sequential raw images with a shared convolutional block attention module and then concatenate those spatial features with their differences
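The fusion step in the last highlight can be sketched as follows. This is an illustrative NumPy toy, not the paper's network: a shared linear map with a sigmoid channel gate stands in for the convolutional block attention module (CBAM), and the frame shapes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3))  # shared weights: 3 input channels -> 8 features

def shared_spatial_features(frame):
    # Stand-in for the shared CBAM block: a 1x1 "convolution" followed by
    # a simple channel-attention reweighting.
    feat = np.einsum('oc,chw->ohw', W, frame)
    gate = 1.0 / (1.0 + np.exp(-feat.mean(axis=(1, 2))))  # sigmoid channel gate
    return feat * gate[:, None, None]

# Three sequential raw frames (t-2, t-1, t), each 3 x 16 x 16.
frames = [rng.standard_normal((3, 16, 16)) for _ in range(3)]
feats = [shared_spatial_features(f) for f in frames]    # spatial features
diffs = [feats[i + 1] - feats[i] for i in range(2)]     # temporal differences

# Concatenate features and their differences along the channel axis,
# giving the spatio-temporal input to the downstream policy head.
fused = np.concatenate(feats + diffs, axis=0)
```

Taking differences *after* the shared, attention-weighted extractor (rather than differencing raw images as in [29]) means the temporal cues are already focused on the target of interest.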



Introduction

Unmanned aerial vehicles (UAVs) are becoming an ideal platform for executing dirty and dangerous tasks due to their high agility and low cost. Perception and control are the two key modules of an autonomous UAV: without them, a UAV cannot derive rich information from a complex environment, make proper decisions, or behave correctly. Autonomous perception and smart control are therefore enduring topics of interest in the UAV community. We focus on the active tracking task of UAVs. Active tracking of a dynamic target is a fundamental capability for UAVs performing monitoring and anti-terrorism operations in GPS-denied environments. This task requires both autonomous perception, to determine the location of the dynamic target, and control, to actively track it; both capabilities can be transferred and generalized to more difficult autonomous tasks.

