Abstract

RGBT tracking is developing rapidly because it exploits the complementary strengths of RGB and thermal frames. Existing high-accuracy methods run at low speed and do not make full use of the hierarchical information available during feature extraction or of the historical information in the sequence. To address these issues, a novel dual-modality space-time memory (DMSTM) network is proposed for robust RGBT tracking. DMSTM consists of three modules. The first is a dual-modality backbone that uses both shallow and deep information by aggregating the feature maps whose dimensions change during downsampling. The second is a space-time memory reader with bimodal fusion, which aggregates features of historical and current frames to share information across the time domain. The third is a Siamese head network, which sums the prediction losses of the two modalities and back-propagates the total. This avoids the degradation in tracking performance caused by training frame pairs in which the targets are not perfectly aligned. Extensive experiments on three RGBT benchmark datasets show that the proposed DMSTM exceeds state-of-the-art methods in both performance and efficiency while running at 27.6 FPS.
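The memory read and the summed two-modality loss described above can be illustrated with a minimal sketch. It is not the DMSTM implementation: the function names (`memory_read`, `fuse_modalities`, `total_loss`), the dot-product similarity, the softmax weighting, and fusion by concatenation are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math


def memory_read(query_key, memory_keys, memory_values):
    """Soft-attention read over stored frames (assumed form, not from the paper).

    Each memory value is weighted by the softmax of the dot product between
    the current-frame key and the corresponding historical key.
    """
    scores = [sum(q * k for q, k in zip(query_key, key)) for key in memory_keys]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory_values[0])
    read = [sum(w * v[d] for w, v in zip(weights, memory_values)) for d in range(dim)]
    return read, weights


def fuse_modalities(rgb_read, thermal_read):
    """Hypothetical bimodal fusion: concatenate the two per-modality reads."""
    return list(rgb_read) + list(thermal_read)


def total_loss(rgb_loss, thermal_loss):
    """Sum the per-modality prediction losses before back-propagation."""
    return rgb_loss + thermal_loss
```

In this sketch, a current-frame key that resembles one stored key more than another gives that frame's value a larger weight in the read, which is one simple way historical frames can inform the current prediction.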
