Abstract

RGB-T salient object detection (SOD) aims to detect and segment salient regions in RGB images and their corresponding thermal maps. The ability to alleviate the difference between the RGB and thermal modalities plays a vital role in the development of RGB-T SOD. However, most existing methods either integrate multi-modal information through various fusion strategies or reduce the modality difference via unidirectional or undifferentiated bidirectional interaction, and they fail in some challenging scenes. To address these issues, a novel Cross-Modality Double Bidirectional Interaction and Fusion Network (CMDBIF-Net) for RGB-T SOD is proposed. Specifically, we construct an interactive branch to indirectly bridge the RGB and thermal modalities. In addition, we propose a double bidirectional interaction (DBI) module composed of a forward interaction block (FIB) and a backward interaction block (BIB) to reduce cross-modality differences. Moreover, a multi-scale feature enhancement and fusion (MSFEF) module is introduced to integrate the multi-modal features while accounting for the internal gap between modalities. Finally, we use a cascaded decoder and a cross-level feature enhancement (CLFE) module to generate high-quality saliency maps. Extensive experiments on three publicly available RGB-T SOD datasets show that the proposed CMDBIF-Net achieves outstanding performance against state-of-the-art (SOTA) RGB-T SOD methods.
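
To make the double bidirectional interaction idea concrete, the following is a minimal PyTorch sketch of one plausible realization. The abstract does not specify the internal design of the FIB and BIB, so the channel-attention gating, module names, and tensor shapes below are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative sketch only: the FIB/BIB names follow the abstract, but the
# channel-gating design inside is an assumption, not the paper's method.
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Squeeze-and-excitation style gate that weights one modality by the other."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(x)


class DoubleBidirectionalInteraction(nn.Module):
    """Hypothetical DBI block: a forward pass enriches thermal features with RGB
    cues (FIB-like), then a backward pass refines RGB features with the updated
    thermal features (BIB-like), before a simple convolutional fusion."""

    def __init__(self, channels: int):
        super().__init__()
        self.rgb_gate = ChannelGate(channels)      # forward interaction (FIB-like)
        self.thermal_gate = ChannelGate(channels)  # backward interaction (BIB-like)
        self.fuse = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, f_rgb: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        # Forward: modulate thermal features with RGB-derived channel weights.
        f_t = f_t + f_t * self.rgb_gate(f_rgb)
        # Backward: modulate RGB features with the updated thermal weights.
        f_rgb = f_rgb + f_rgb * self.thermal_gate(f_t)
        # Fuse the interacted features for a downstream decoder.
        return self.fuse(torch.cat([f_rgb, f_t], dim=1))


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 56, 56)      # batch of RGB backbone features
    thermal = torch.randn(2, 64, 56, 56)  # matching thermal backbone features
    fused = DoubleBidirectionalInteraction(64)(rgb, thermal)
    print(fused.shape)  # torch.Size([2, 64, 56, 56])
```

The key design point the sketch tries to capture is ordering: the backward refinement operates on thermal features that have already been updated by the forward pass, so the two directions are differentiated rather than symmetric, which is what distinguishes this from the undifferentiated bidirectional interaction the abstract criticizes.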
