Abstract
Existing deployed visual trackers for Unmanned Aerial Vehicles (UAVs) are usually based on the correlation filter framework. Although these methods offer low computational complexity, their tracking performance on small targets and in fast-motion scenarios is unsatisfactory. In this paper, we present a novel multi-level prediction Siamese network (MLPS) for object tracking in UAV videos, which consists of a Siamese feature extraction module and a multi-level prediction module. The multi-level prediction module makes full use of the characteristics of the features at each layer to achieve robust evaluation of targets at different scales. Meanwhile, for small-target tracking, we design a residual feature fusion block that constrains the low-level feature representation with high-level abstract semantics, improving the tracker's ability to distinguish scene details. In addition, we propose a layer attention fusion block that is sensitive to the informative features of each layer and achieves adaptive fusion of correlation responses from different levels by dynamically balancing the multi-layer features. Extensive experiments on several UAV tracking benchmarks demonstrate that MLPS achieves state-of-the-art performance and runs at over 97 FPS.
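To illustrate the idea of adaptively fusing correlation responses from different feature levels, the following is a minimal PyTorch sketch of a layer-attention-style fusion block. It is not the paper's implementation; the class name `LayerAttentionFusion`, the pooling-plus-MLP scoring scheme, and the parameters `num_levels` and `channels` are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerAttentionFusion(nn.Module):
    """Hypothetical sketch: pool each level's correlation response into a
    descriptor, predict a softmax weight per level, and return the weighted
    sum of the responses."""

    def __init__(self, num_levels: int = 3, channels: int = 256):
        super().__init__()
        # A small MLP maps the concatenated pooled descriptors to one score per level.
        self.score = nn.Sequential(
            nn.Linear(num_levels * channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, num_levels),
        )

    def forward(self, responses):
        # responses: list of correlation maps, each of shape (B, C, H, W).
        pooled = [F.adaptive_avg_pool2d(r, 1).flatten(1) for r in responses]  # (B, C) each
        weights = F.softmax(self.score(torch.cat(pooled, dim=1)), dim=1)      # (B, num_levels)
        # Weighted sum of the per-level responses, broadcasting each scalar weight.
        fused = sum(w.view(-1, 1, 1, 1) * r
                    for w, r in zip(weights.unbind(dim=1), responses))
        return fused


if __name__ == "__main__":
    # Toy usage: three levels of 256-channel correlation responses.
    block = LayerAttentionFusion(num_levels=3, channels=256)
    maps = [torch.randn(2, 256, 25, 25) for _ in range(3)]
    print(block(maps).shape)  # torch.Size([2, 256, 25, 25])
```

The key design choice sketched here is that the fusion weights are computed per input rather than fixed, so the network can dynamically rebalance low-level detail against high-level semantics depending on the scene.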