Abstract

The key to RGB-D salient object detection is the effective fusion of the different modal features of RGB images and depth maps. This study proposes an RGB-D salient object detection method based on multimodal feature fusion. First, in the encoding stage, essential features are extracted from the depth map using spatial and channel attention modules and then merged with the RGB features to strengthen the representation of salient objects. Second, in the decoding stage, a multimodal, multilevel feature fusion module and a global context-feature guidance module are proposed to reduce missed and false detections, allowing the network to decode the spatial structure of multiple objects and small objects more accurately. Experiments on four datasets show that our method outperforms 15 other deep-learning detection methods on multiple evaluation metrics.
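As a rough illustration of the encoding-stage fusion described above, the sketch below applies channel attention and then spatial attention to depth features before merging them with RGB features. This is a minimal NumPy sketch under stated assumptions: the global-average pooling, sigmoid gating, and element-wise additive merge are illustrative choices, not the paper's exact module design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Squeeze each channel to a scalar by global
    # average pooling, then gate channels with a sigmoid weight.
    pooled = feat.mean(axis=(1, 2))           # (C,)
    return sigmoid(pooled)[:, None, None]     # (C, 1, 1), broadcastable

def spatial_attention(feat):
    # Collapse channels into a single map, then gate each spatial
    # location with a sigmoid weight.
    pooled = feat.mean(axis=0, keepdims=True)  # (1, H, W)
    return sigmoid(pooled)

def fuse_rgb_depth(rgb_feat, depth_feat):
    # Refine depth features with channel then spatial attention,
    # then merge with the RGB features (additive merge assumed here).
    refined = depth_feat * channel_attention(depth_feat)
    refined = refined * spatial_attention(refined)
    return rgb_feat + refined

# Toy feature maps: 8 channels, 16x16 spatial resolution.
rng = np.random.default_rng(0)
rgb = rng.random((8, 16, 16))
depth = rng.random((8, 16, 16))
fused = fuse_rgb_depth(rgb, depth)
print(fused.shape)  # (8, 16, 16)
```

The attention gates here keep the fused tensor the same shape as the inputs, so the refined depth branch can be added to the RGB branch at every encoder level.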
