Multi-Level Fusion for Robust RGBT Tracking via Enhanced Thermal Representation

Zhangyong Tang, Tianyang Xu, Josef Kittler, Xiao-Jun Wu

doi:10.1145/3678176

Abstract

Due to the limitations of visible (RGB) sensors in challenging scenarios, such as nighttime and foggy environments, the thermal infrared (TIR) modality is drawing increasing attention as an auxiliary source for robust tracking systems. Existing methods extract RGB and TIR cues in a similar way, i.e., by utilising RGB-pretrained models with or without finetuning, and then aggregate the multi-modal information through a fusion block embedded at a single level. However, the different imaging principles of RGB and TIR data raise questions about the suitability of RGB-pretrained models for thermal data. In this paper, it is argued that this modality gap has been overlooked, and an alternative training paradigm is proposed for TIR data to ensure consistency between the training and test data, which is achieved by optimising the TIR feature extractor with only TIR data involved. Furthermore, to make better use of the enhanced thermal representations, a multi-level fusion strategy is proposed, inspired by the observation that fusion strategies at different levels can each contribute to better performance. Specifically, fusion modules at both the feature and decision levels are derived for a comprehensive fusion procedure, while pixel-level fusion is not considered due to the misalignment of multi-modal image pairs. The effectiveness of the method is demonstrated by extensive qualitative and quantitative experiments conducted on several challenging benchmarks. Code will be released at https://github.com/Zhangyong-Tang/MELT.
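To make the distinction between the two fusion levels concrete, the following is a minimal sketch, not the authors' implementation. It assumes that feature-level fusion can be approximated by a convex combination of per-modality feature vectors and that decision-level fusion can be approximated by a weighted average of per-modality response maps. The function names (`fuse_features`, `fuse_decisions`, `locate_peak`), the `tir_weight` parameter, and the toy values are all hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Vector = List[float]
ResponseMap = List[List[float]]


def _check_weight(tir_weight: float) -> None:
    # The weight is a hypothetical fixed scalar; a real tracker would likely learn it.
    if not 0.0 <= tir_weight <= 1.0:
        raise ValueError("tir_weight must lie in [0, 1]")


def fuse_features(rgb_feat: Vector, tir_feat: Vector, tir_weight: float = 0.5) -> Vector:
    """Feature-level fusion (assumed form): element-wise convex combination."""
    _check_weight(tir_weight)
    if len(rgb_feat) != len(tir_feat):
        raise ValueError("feature vectors must have the same length")
    return [(1.0 - tir_weight) * r + tir_weight * t for r, t in zip(rgb_feat, tir_feat)]


def fuse_decisions(rgb_map: ResponseMap, tir_map: ResponseMap,
                   tir_weight: float = 0.5) -> ResponseMap:
    """Decision-level fusion (assumed form): weighted average of response maps."""
    _check_weight(tir_weight)
    if len(rgb_map) != len(tir_map):
        raise ValueError("response maps must have the same number of rows")
    # Each row is fused with the same rule used for feature vectors.
    return [fuse_features(rgb_row, tir_row, tir_weight)
            for rgb_row, tir_row in zip(rgb_map, tir_map)]


def locate_peak(response: ResponseMap) -> Tuple[int, int]:
    """Return the (row, col) index of the highest response value."""
    cells = ((value, (r, c))
             for r, row in enumerate(response)
             for c, value in enumerate(row))
    return max(cells, key=lambda item: item[0])[1]


# Toy night-time case: the RGB response is ambiguous, the TIR response is peaked.
rgb_map = [[0.2, 0.3], [0.3, 0.2]]
tir_map = [[0.1, 0.1], [0.9, 0.1]]
fused_map = fuse_decisions(rgb_map, tir_map, tir_weight=0.5)
peak = locate_peak(fused_map)  # (1, 0): the TIR peak decides the location
```

In this toy case the RGB map alone has two equally strong candidates, and the fused map resolves the tie in favour of the position supported by the thermal modality. Pixel-level fusion is left out of the sketch, in line with the abstract's remark about misaligned image pairs.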


More From: ACM Transactions on Multimedia Computing, Communications, and Applications
