Target tracking from the UAV perspective uses UAV cameras to capture video streams and to identify and track specific targets in real time. Deep learning UAV trackers based on the Siamese family have achieved significant results but still struggle to balance accuracy and speed. In this study, to refine the feature representation and reduce the computational cost of the tracker, we perform feature fusion within the deep inter-correlation operation and introduce a global attention mechanism that enlarges the model's field of view and strengthens its feature refinement capability, improving tracking performance on small targets. In addition, we design an anchor-free, frame-aware feature modulation mechanism that reduces computation and generates high-quality anchors, and we optimize the target frame refinement computation to improve adaptability to target deformation and motion. Comparison experiments against several popular algorithms on UAV tracking datasets, including UAV123@10fps, UAV20L, and DTB70, show that the algorithm balances speed and accuracy. To verify the reliability of the algorithm, we built a physical experimental environment on the Jetson Orin Nano platform, where it reached a real-time processing speed of 30 frames per second.
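As a rough illustration of the correlation step that Siamese trackers build on, the sketch below assumes that the "deep inter-correlation operation" refers to the depth-wise cross-correlation commonly used in this tracker family, where each channel of the template feature map is slid over the matching channel of the search feature map. The function name `depthwise_xcorr` and the array shapes are hypothetical choices for illustration. This is not the authors' implementation, and it omits the feature fusion and global attention described in the abstract.

```python
import numpy as np


def depthwise_xcorr(search: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Channel-wise (depth-wise) cross-correlation, no padding, stride 1.

    search:   feature map of the search region, shape (C, Hs, Ws)
    template: feature map of the target template, shape (C, Ht, Wt)
    returns:  response map, shape (C, Hs - Ht + 1, Ws - Wt + 1)

    Hypothetical helper for illustration only.
    """
    if search.ndim != 3 or template.ndim != 3:
        raise ValueError("expected (C, H, W) arrays")
    if search.shape[0] != template.shape[0]:
        raise ValueError("search and template must have the same channel count")

    _, ht, wt = template.shape
    # Windows over the two spatial axes: shape (C, Ho, Wo, Ht, Wt).
    windows = np.lib.stride_tricks.sliding_window_view(
        search, (ht, wt), axis=(1, 2)
    )
    # For every channel and window position, sum the elementwise product
    # of the window with that channel's template.
    return np.einsum("cijkl,ckl->cij", windows, template)
```

Each output channel is a response map for one feature channel; the location of its maximum indicates where the template matches the search region best in that channel. Keeping the channels separate, rather than summing them into a single map, is what lets later layers fuse or reweight them.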