Abstract

Mainstream RGB-D saliency detection algorithms introduce noise from the depth map, and high-level features are diluted during fusion. To address these problems, we propose an RGB-D saliency detection method based on cross-modal and multi-scale feature fusion. A Cross-Modal Feature Fusion Module (CMFFM) fuses the RGB image, whose strength lies in semantic features, with the depth map, whose strength lies in positional features; in doing so, CMFFM effectively suppresses noise interference from the depth map. We further propose a Multi-Scale Residual Channel Attention Feature Fusion Module (MRCAFFM) that fuses high-level and low-level features level by level, enriching the expression of high-level semantic features and enhancing the capability of feature selection. Experimental results on four benchmark datasets show that the proposed algorithm outperforms the compared algorithms in overall performance.
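The abstract does not specify the internals of the fusion modules. As a rough, hypothetical illustration of the kind of channel-attention gating that MRCAFFM's name suggests, the following NumPy sketch reweights depth-map feature channels before adding them to RGB features, so that noisy depth channels can be down-weighted (all function names, weights, and shapes below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def channel_attention(feat, reduction=2):
    """Squeeze-and-excitation style channel attention (illustrative only).

    feat: array of shape (C, H, W). Returns the features reweighted
    per channel by a learned gate; here the MLP weights are random
    placeholders standing in for trained parameters.
    """
    c, _, _ = feat.shape
    # Squeeze: global average pooling over spatial dims -> (C,)
    desc = feat.mean(axis=(1, 2))
    # Excitation: two-layer bottleneck MLP (placeholder weights)
    rng = np.random.default_rng(0)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    hidden = np.maximum(w1 @ desc, 0.0)          # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, values in (0, 1)
    # Broadcast the per-channel gate over the spatial dimensions
    return feat * gate[:, None, None]

def fuse_rgb_depth(rgb_feat, depth_feat):
    """Hypothetical cross-modal fusion: gate the depth features with
    channel attention before adding them to the RGB features."""
    return rgb_feat + channel_attention(depth_feat)

rgb = np.ones((4, 8, 8))
depth = np.ones((4, 8, 8))
fused = fuse_rgb_depth(rgb, depth)
print(fused.shape)
```

Because the sigmoid gate lies strictly in (0, 1), each depth channel contributes only a fraction of its magnitude to the fused result, which is one simple way attention can suppress noisy channels.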
