Existing visual object tracking methods use only the target region of the first frame as a template, which makes them prone to failure when the target changes quickly or the background is complex. To address this problem, this paper proposes a Transformer-based object tracking algorithm that focuses on the target information in the template and dynamically updates the template features. To reduce the interference of background information on attention, the algorithm uses a sparse Transformer module for the interaction of feature information. A template focus attention module is also proposed that attends to both the dynamic template features and the initial template features, so that the highly reliable feature information in the initial template is retained. Experimental results show that the algorithm achieves a success rate of 70.9% and an accuracy of 91.6% on the OTB100 benchmark, improvements of 4.11% and 3.27%, respectively, over STARK, a comparable template-update algorithm. The algorithm thus addresses the limitations of existing visual object tracking methods and improves the accuracy and robustness of tracking.
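
The abstract does not state how the sparse Transformer module sparsifies attention. One common approach, shown here purely as an assumption and not as the paper's method, is top-k attention: each query keeps only its k highest-scoring keys, so low-scoring (for example, background) positions receive zero weight. The function name `topk_sparse_attention` and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def topk_sparse_attention(query, keys, values, k):
    """Attend from one query vector to only the k highest-scoring keys (k >= 1).

    Illustrative sketch of top-k sparse attention; the paper's actual
    sparsification rule is not given in the abstract.
    """
    d = len(query)
    # Scaled dot-product score between the query and every key.
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d) for key in keys]
    # Indices of the k largest scores; every other position is masked out.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the kept positions.
    top = max(scores[i] for i in keep)
    exp = {i: math.exp(scores[i] - top) for i in keep}
    total = sum(exp.values())
    weights = [exp[i] / total if i in exp else 0.0 for i in range(len(scores))]
    # Weighted sum of value vectors; masked positions contribute nothing.
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return out, weights


# Toy example: two keys resemble the query (target-like), two do not (background-like).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 5.0]]
out, weights = topk_sparse_attention(query, keys, values, k=2)
```

With `k=2`, the two background-like keys receive exactly zero weight, so their values cannot leak into the output, which is the kind of background suppression the abstract attributes to sparse attention.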