Abstract

Efficient object detection and tracking in remote sensing video acquired by unmanned aerial vehicles (UAVs) has significant implications for domains such as scene understanding, traffic surveillance, and military operations. Although modern transformer-based trackers achieve superior tracking accuracy, they often require extensive training time to converge, and template information is neither fully exploited nor well integrated into tracking. To accelerate convergence and further improve tracking accuracy, we propose an end-to-end tracker named ParallelTracker that extracts prior knowledge from templates for better convergence and enriches template features in a parallel manner. Our core design injects spatial prior knowledge into the tracking process through three modules: a prior knowledge extractor module (PEM), a template features parallel enhancing module (TPM), and a template prior knowledge merge module (TPKM). These modules enable rich, discriminative feature extraction as well as integration of target information. We stack multiple PEM, TPM, and TPKM modules along with a localization head to improve both accuracy and convergence speed in object tracking. To enable efficient online tracking, we also design an efficient parallel scoring prediction head (PSH) that selects high-quality online templates. ParallelTracker achieves state-of-the-art performance on the UAV tracking benchmark UAV123, with an AUC score of 69.29%, surpassing the recent OSTrack and STARK methods. Ablation studies further confirm the positive impact of the designed modules on both convergence and accuracy.
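The abstract assigns the PSH the role of scoring candidate templates and keeping only high-quality ones for online tracking. As an illustration of that idea only, the following minimal sketch maintains a small memory of templates ranked by a predicted quality score; the class name, capacity, and threshold are our own assumptions, not the paper's implementation.

```python
# Hypothetical sketch of online template selection by quality score,
# in the spirit of the PSH described in the abstract. All names and
# defaults here are illustrative assumptions.

class OnlineTemplateMemory:
    def __init__(self, capacity=3, score_threshold=0.5):
        self.capacity = capacity            # max templates kept online
        self.score_threshold = score_threshold  # reject weak candidates
        self.templates = []                 # list of (score, template)

    def update(self, template, score):
        """Add a candidate template if its predicted quality is high enough."""
        if score < self.score_threshold:
            return  # discard low-quality candidates outright
        self.templates.append((score, template))
        # Keep only the top-`capacity` templates by score.
        self.templates.sort(key=lambda pair: pair[0], reverse=True)
        self.templates = self.templates[:self.capacity]

    def best(self):
        """Return the current highest-scoring template, if any."""
        return self.templates[0][1] if self.templates else None
```

In use, each new frame's candidate crop would be scored by the prediction head and passed to `update`, so the tracker always conditions on its most reliable recent templates.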
