Abstract

Infrared target tracking is a fundamental task that plays an important role in military applications, weapon guidance, and other fields because of the advantages of infrared imaging systems. However, it is challenging because infrared targets have lower resolution, smaller size, and less texture information. Although a variety of methods demonstrate promising performance on visual target tracking, they do not transfer directly to infrared images. To address this issue, we propose InfTrans, an infrared target tracker based on a Siamese network that consists of three components: a feature extraction backbone, an encoder-decoder transformer, and a prediction head. InfTrans employs the transformer architecture to exploit spatial and temporal information effectively. First, because transformers are well suited to capturing long-range dependencies, the encoder-decoder transformer is placed after the backbone to capture the relationship between the template and the search patches along the spatial and temporal dimensions. Second, to take advantage of the transformer's parallel computation, InfTrans takes a sequence of search patches as the input to the search branch and processes multiple frames in parallel. Finally, a prediction head outputs the corresponding sequence of bounding boxes. InfTrans thus treats infrared target tracking as a parallel bounding box prediction problem. Extensive experiments show that the proposed tracker achieves promising performance on a public infrared dataset.
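
The idea of predicting one bounding box per search patch in a single pass can be illustrated with a toy sketch. This is not the InfTrans implementation: the function names, the dimensions, the mean pooling, the random "features" standing in for backbone output, and the single cross-attention step standing in for the encoder-decoder transformer are all assumptions made for illustration. In particular, frames do not attend to each other here, so the temporal modeling described in the abstract is not reproduced.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query token attends over all key tokens."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def predict_boxes(template_tokens, search_sequence, head_weights):
    """Return one normalized (cx, cy, w, h) box per search patch (hypothetical head)."""
    n_tokens = len(search_sequence[0])
    # Flatten T patches x N tokens into one query set, so all frames
    # go through a single attention pass against the template.
    flat = [tok for patch in search_sequence for tok in patch]
    attended = cross_attention(flat, template_tokens, template_tokens)
    boxes = []
    for t in range(len(search_sequence)):
        frame = attended[t * n_tokens:(t + 1) * n_tokens]
        pooled = [sum(col) / n_tokens for col in zip(*frame)]  # mean-pool tokens
        logits = [sum(w * x for w, x in zip(row, pooled)) for row in head_weights]
        boxes.append(tuple(1.0 / (1.0 + math.exp(-z)) for z in logits))  # sigmoid
    return boxes


def random_tokens(rng, n, dim):
    """Random vectors standing in for backbone features (assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM, N_TEMPLATE, N_SEARCH, NUM_FRAMES = 8, 4, 6, 3  # illustrative sizes
template = random_tokens(rng, N_TEMPLATE, DIM)
search_seq = [random_tokens(rng, N_SEARCH, DIM) for _ in range(NUM_FRAMES)]
head = [[rng.gauss(0.0, 0.1) for _ in range(DIM)] for _ in range(4)]

boxes = predict_boxes(template, search_seq, head)
for t, box in enumerate(boxes):
    print(t, [round(v, 3) for v in box])
```

Calling `predict_boxes` with a sequence of `NUM_FRAMES` patches returns `NUM_FRAMES` boxes at once, which mirrors the framing of tracking as parallel bounding box prediction. A real tracker would batch these operations as tensor computations on a GPU rather than loop in Python.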
