Abstract
Combination of RGB and thermal sensors has been proven to be useful for many vision applications. However, how to effectively fuse the information of two modalities remains a challenging problem. In this paper, we propose an Illumination-Aware Window Transformer (IAWT) fusion module to handle the RGB and thermal multi-modality fusion. Specifically, the IAWT fusion module adopts a window-based multi-modality attention combined with additional estimated illumination information. The window-based multi-modality attention infers dependency cross modalities within a local window, thus implicitly alleviate the problem caused by weakly spatial misalignment of the RGB and thermal image pairs within specific dataset. The introduction of estimated illumination feature enables the fusion module to adaptively merge the two modalities according to illumination conditions so as to make full use of the complementary characteristics of RGB and thermal images under different environments. Besides, our proposed fusion module is task-agnostic and data-specific, which means it can be used for different tasks with RGBT inputs. To evaluate the advances of the proposed fusion method, we embed the IAWT fusion module into different networks and conduct the experiments on various RGBT tasks, including pedestrian detection, semantic segmentation and crowd counting. Extensive results demonstrate the superior performance of our method.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Journal of Visual Communication and Image Representation
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.