Abstract

AbstractSingle target tracking based on computer vision helps to collect, analyse and exploit target information. The SwinTrack algorithm has received widespread attention as one of the twin network algorithms with the best trade‐off between tracking accuracy and speed, but it also suffers from the insufficient fusion of deep and shallow features leading to loss of shallow information and insufficient use of temporal information leading to inconsistency between target and template. Semantic information and detailed information are combined and multiple convolutional forms are introduced to propose a multi‐level feature fusion strategy to effectively fuse features in space. Besides, based on the idea of feedback, a dynamic template branching approach is also designed to fuse temporal features and enhance the representation of target features. The effectiveness of this method was verified on the OTB100 and GOT10K datasets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call