Abstract

A key challenge in thermal infrared tracking is representing the target effectively and efficiently with neural networks in the thermal infrared domain. The scarcity of thermal infrared training data makes it difficult to train a robust infrared object tracker from scratch, and time-consuming convolution operations slow tracking down. To address these problems, we propose cross-modal compression distillation, which represents thermal infrared objects for tracking by leveraging an off-the-shelf RGB model through knowledge distillation. Specifically, cross-modal distillation transfers knowledge from the RGB modality to the thermal infrared modality by feeding paired RGB and thermal infrared images into the two branches of a Siamese network. In addition, within the teacher–student architecture, the feature extractor is compressed into a lightweight model by model pruning and multi-level deep feature matching. Experimental results on the LSOTB-TIR and PTB-TIR datasets show that the thermal infrared object tracking models distilled by our method track faster and perform better than the baseline RGB tracker, with improvements of 1.5% in Success Rate, 2.2% in Precision, 1.9% in Normalized Precision, and 58 frames per second (FPS) on the LSOTB-TIR dataset.
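
The multi-level deep feature matching mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the loss, so the mean squared error, the per-level weights, and the names `mse` and `multi_level_matching_loss` are all assumptions made for illustration. The sketch also assumes that the teacher features come from the RGB image, the student features come from the paired thermal infrared image, and that features at each level have already been flattened to equal-length vectors.

```python
from typing import Optional, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("feature vectors must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_level_matching_loss(
    teacher_feats: Sequence[Sequence[float]],
    student_feats: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-level MSE between teacher and student features.

    teacher_feats[i] and student_feats[i] are the flattened features of
    level i. The choice of MSE and of per-level weights is an assumption.
    """
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student must expose the same number of levels")
    if weights is None:
        weights = [1.0] * len(teacher_feats)  # equal weighting by default
    if len(weights) != len(teacher_feats):
        raise ValueError("one weight per feature level is required")
    return sum(
        w * mse(t, s) for w, t, s in zip(weights, teacher_feats, student_feats)
    )


# Toy features from two levels (hypothetical values, not real network outputs).
rgb_teacher = [[1.0, 2.0], [0.0, 0.0, 4.0]]
tir_student = [[1.0, 0.0], [0.0, 0.0, 1.0]]
loss = multi_level_matching_loss(rgb_teacher, tir_student, weights=[0.5, 1.0])
```

In a real training loop this quantity would be computed on network activations and minimized with respect to the student's parameters only, with the teacher kept frozen; a pruned student would typically need a projection layer wherever its channel count differs from the teacher's.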
