Abstract

RGB-D saliency detection aims to jointly exploit RGB images and depth maps to detect salient objects. This field still faces two challenges: 1) how to extract representative multimodal features and 2) how to effectively fuse them. Most previous methods treat RGB and depth information equally as two modalities, without considering their differences in the frequency domain, and may therefore lose complementary information. In this paper, we introduce a frequency channel attention mechanism into the fusion process. First, we design a frequency-aware cross-modality attention (FACMA) module to interweave channel features across modalities and select representative ones. Within the FACMA module, we also propose a spatial frequency channel attention (SFCA) module to introduce more complementary information in different channels. Second, we develop a weighted cross-modality fusion (WCMF) module that adaptively fuses multimodal features by learning content-dependent weight maps. Comprehensive experiments on several benchmark datasets demonstrate that the proposed framework outperforms seventeen state-of-the-art methods.
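
The abstract only names the modules; their internals are not given here. As a loose illustration, the PyTorch sketch below shows the two ideas the abstract describes: a frequency-based channel attention (assumed here to follow FcaNet-style DCT pooling, since the paper builds on frequency channel attention) and a weighted fusion that predicts content-dependent weight maps. All class names, layer choices, and hyperparameters are assumptions made for illustration, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    def dct_basis(h, w, freq_u, freq_v):
        """2D DCT-II basis of frequency (freq_u, freq_v) on an h x w grid."""
        ys = torch.arange(h).float()
        xs = torch.arange(w).float()
        basis_y = torch.cos((2 * ys + 1) * freq_u * torch.pi / (2 * h))
        basis_x = torch.cos((2 * xs + 1) * freq_v * torch.pi / (2 * w))
        return basis_y[:, None] * basis_x[None, :]  # (h, w)


    class FrequencyChannelAttention(nn.Module):
        """SFCA-like attention (assumed form): channels are pooled with several
        DCT frequency components instead of global average pooling alone."""

        def __init__(self, channels, pool_size=7,
                     freqs=((0, 0), (0, 1), (1, 0), (1, 1)), reduction=16):
            super().__init__()
            self.pool_size = pool_size
            self.n_freqs = len(freqs)
            assert channels % self.n_freqs == 0
            # One fixed DCT basis per frequency; channel groups are each
            # pooled by a different frequency, as in FcaNet.
            basis = torch.stack([dct_basis(pool_size, pool_size, u, v)
                                 for u, v in freqs])
            self.register_buffer("basis", basis)  # (F, h, w)
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):
            b, c, _, _ = x.shape
            x_small = F.adaptive_avg_pool2d(x, self.pool_size)  # (B, C, h, w)
            groups = x_small.view(b, self.n_freqs, c // self.n_freqs,
                                  self.pool_size, self.pool_size)
            # Project each channel group onto its DCT basis -> (B, F, C/F).
            pooled = (groups * self.basis[None, :, None]).sum(dim=(-2, -1))
            weights = self.fc(pooled.reshape(b, c))  # (B, C) channel weights
            return x * weights.view(b, c, 1, 1)


    class WeightedCrossModalityFusion(nn.Module):
        """WCMF-like fusion (assumed form): predict two content-dependent
        spatial weight maps and take a weighted sum of RGB and depth features."""

        def __init__(self, channels):
            super().__init__()
            self.weight_head = nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1)

        def forward(self, f_rgb, f_depth):
            w = torch.softmax(
                self.weight_head(torch.cat([f_rgb, f_depth], dim=1)), dim=1)
            return w[:, 0:1] * f_rgb + w[:, 1:2] * f_depth


    # Example: fuse 64-channel RGB/depth features after frequency attention.
    sfca = FrequencyChannelAttention(64)
    wcmf = WeightedCrossModalityFusion(64)
    f_rgb, f_d = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
    out = wcmf(sfca(f_rgb), sfca(f_d))  # (2, 64, 32, 32)

Under these assumptions, pooling each channel group with a different DCT frequency lets the attention weights depend on more than the zero-frequency (global average) component, which is the frequency-domain complementarity the abstract refers to, while the learned weight maps let the fusion favor whichever modality is more reliable at each spatial location.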
