Abstract
In RGB-T salient object detection (SOD), effectively exploiting the distinct characteristics of the RGB and thermal modalities is essential for accurate detection. Most previous methods focus only on reducing the differences between modalities, which can discard modality-specific features that are crucial for salient object detection and thus lead to suboptimal results. To address this issue, we propose an RGB-T SOD network that simultaneously reduces modality differences and preserves modality-specific features. Specifically, we construct a modality differences reduction and specific features preserving module (MDRSFPM) that bridges the gap between modalities while enhancing the specific features of each modality. In MDRSFPM, a dynamic vector generated by the interaction of RGB and thermal features reduces modality differences; a dual branch then processes the RGB and thermal modalities separately, combining channel-level and spatial-level operations to preserve their respective specific features. In addition, a multi-scale global feature enhancement module (MGFEM) is proposed to enhance global contextual information and guide the subsequent decoding stage, so that the model can more easily localize salient objects. Furthermore, our approach includes a fully fusion and gate module (FFGM) that uses dynamically generated importance maps to selectively filter and fuse features during decoding. Extensive experiments demonstrate that the proposed model remarkably surpasses other state-of-the-art models on three publicly available RGB-T datasets. Our code will be released at https://github.com/JOOOOKII/FRPNet.
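To make the MDRSFPM idea concrete, the following is a minimal NumPy sketch of its two stages as described in the abstract: a dynamic channel vector derived from the interaction of the RGB and thermal features reduces modality differences, and per-modality branches then re-emphasize each modality's own features. All function names, the choice of element-wise product for the interaction, and the sigmoid gating are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mdrsfpm_sketch(rgb, thermal):
    """Hypothetical sketch of MDRSFPM on (C, H, W) feature maps.

    Stage 1: a dynamic vector from the RGB-thermal interaction
    re-weights shared channels to reduce modality differences.
    Stage 2: each branch keeps its own features via a residual path,
    standing in for the paper's channel/spatial-level operations.
    """
    # Interaction of the two modalities (assumed element-wise product),
    # pooled into a dynamic per-channel vector of shape (C,).
    interaction = rgb * thermal
    dyn_vec = sigmoid(interaction.mean(axis=(1, 2)))

    # Re-weight each modality toward the shared channels, while the
    # residual addition preserves each modality's specific features.
    rgb_out = rgb * dyn_vec[:, None, None] + rgb
    thermal_out = thermal * dyn_vec[:, None, None] + thermal
    return rgb_out, thermal_out
```

In this sketch the gating vector is shared across both branches (encouraging alignment), while the residual connections keep the original modality-specific signal, mirroring the abstract's "reduce differences, preserve specific features" split.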