Dual-Modality Space-Time Memory Network for RGBT Tracking

Fan Zhang, Yuqian Zhao, Hanwei Peng, Lingli Yu, Baifan Chen

doi:10.1109/tim.2023.3282668

https://doi.org/10.1109/tim.2023.3282668

Abstract

RGBT tracking is developing rapidly because RGB and thermal frames offer complementary advantages. Existing high-accuracy methods track at low speed and do not make full use of the hierarchical information available during feature extraction or the historical information in the sequences. To address these issues, a novel dual-modality space-time memory (DMSTM) network is proposed for robust RGBT tracking. Specifically, DMSTM consists of three modules. The first is a dual-modality backbone that uses both shallow and deep information by aggregating the feature maps taken where dimensions change during downsampling. The second is a space-time memory reader with bimodal fusion, which aggregates features of historical and current frames to share information in the time domain. The third is a Siamese head network, which computes the sum of the prediction losses of the two modalities and back-propagates it. This avoids the degradation in tracking performance caused by sequence frame pairs in which the training targets are not perfectly aligned. Extensive experiments on three RGBT benchmark datasets show that the proposed DMSTM exceeds state-of-the-art methods in both performance and efficiency while running at 27.6 FPS.
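
The memory-reading idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a generic attention-style read in which each position of the current frame (the query) is matched against all positions of the stored historical frames (the memory), and it assumes that bimodal fusion is a simple channel concatenation and that the training loss is a plain sum of per-modality losses. The function names `memory_read`, `dual_modality_read`, and `summed_loss` are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def memory_read(mem_keys, mem_values, query_keys):
    """Attention-style read from a space-time memory (assumed form).

    mem_keys:   (N, C) keys for N = T*H*W positions of historical frames
    mem_values: (N, D) values stored at the same positions
    query_keys: (M, C) keys for M = H*W positions of the current frame
    Returns an (M, D) array: one memory read-out per query position.
    """
    scale = np.sqrt(mem_keys.shape[1])
    similarity = query_keys @ mem_keys.T / scale   # (M, N)
    weights = softmax(similarity, axis=1)          # each row sums to 1
    return weights @ mem_values                    # (M, D)


def dual_modality_read(rgb, tir, query_rgb, query_tir):
    """Read each modality's memory, then fuse by concatenation.

    rgb and tir are (keys, values) tuples. Concatenation is an
    assumption made here for illustration only.
    """
    read_rgb = memory_read(rgb[0], rgb[1], query_rgb)
    read_tir = memory_read(tir[0], tir[1], query_tir)
    return np.concatenate([read_rgb, read_tir], axis=-1)


def summed_loss(loss_rgb, loss_tir):
    """Total training loss as the sum of the two per-modality losses."""
    return loss_rgb + loss_tir


rng = np.random.default_rng(0)
T, H, W, C, D = 3, 2, 2, 4, 6          # toy sizes, chosen arbitrarily
N, M = T * H * W, H * W
rgb_memory = (rng.normal(size=(N, C)), rng.normal(size=(N, D)))
tir_memory = (rng.normal(size=(N, C)), rng.normal(size=(N, D)))
fused = dual_modality_read(
    rgb_memory, tir_memory,
    rng.normal(size=(M, C)), rng.normal(size=(M, C)),
)
print(fused.shape)  # (4, 12)
```

Because the read weights are a softmax over all memory positions, every read-out is a convex combination of stored values, which is one way historical frames can inform the current frame without an online update step.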