Abstract

Red–green–blue and depth (RGB-D) saliency detection has recently attracted much research attention; however, the effective use of depth information remains challenging. This paper proposes a method that leverages the clear object shapes captured in depth maps to detect the boundaries of salient objects. Because context plays an important role in saliency detection, the method employs a proposed end-to-end multiscale multilevel context and multimodal fusion network (MCMFNet) that aggregates multiscale multilevel context feature maps to accurately detect salient objects of varying sizes. Finally, a coarse-to-fine strategy is applied in an attention module that combines the multilevel and multimodal feature maps to produce the final saliency map. A comprehensive loss function is also incorporated into MCMFNet to optimize the network parameters. Extensive experiments on four representative datasets demonstrate the effectiveness of the proposed method and its substantial improvement over state-of-the-art RGB-D salient object detection methods.
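
To make the ideas named in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of two generic ingredients it mentions: attention-based fusion of RGB and depth features, and multiscale context aggregation, together with a combined BCE + IoU objective as one plausible form of a "comprehensive" loss. All module structures, dilation rates, channel sizes, and loss weights below are assumptions for illustration and do not reproduce the authors' exact MCMFNet design.

```python
# Illustrative sketch only: a generic attention-fusion + multiscale-context
# block for RGB-D saliency, NOT the authors' exact MCMFNet architecture.
# Dilation rates, channel sizes, and loss weights are assumed values.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionFusion(nn.Module):
    """Channel attention re-weights depth features before adding them to RGB."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),          # global context per channel
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),                     # per-channel weight in [0, 1]
        )

    def forward(self, rgb_feat, depth_feat):
        return rgb_feat + self.gate(depth_feat) * depth_feat


class MultiscaleContext(nn.Module):
    """Parallel dilated convolutions capture context at several scales."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.project = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        ctx = torch.cat([F.relu(b(x)) for b in self.branches], dim=1)
        return F.relu(self.project(ctx))


def saliency_loss(pred_logits, target, iou_weight=1.0):
    """Assumed 'comprehensive' loss: pixel-wise BCE plus a soft IoU term."""
    bce = F.binary_cross_entropy_with_logits(pred_logits, target)
    p = torch.sigmoid(pred_logits)
    inter = (p * target).sum(dim=(2, 3))
    union = (p + target - p * target).sum(dim=(2, 3))
    iou = 1.0 - (inter + 1.0) / (union + 1.0)
    return bce + iou_weight * iou.mean()


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 56, 56)      # backbone RGB features (dummy)
    depth = torch.randn(2, 64, 56, 56)    # backbone depth features (dummy)
    fused = AttentionFusion(64)(rgb, depth)
    context = MultiscaleContext(64)(fused)
    logits = nn.Conv2d(64, 1, 1)(context)       # coarse saliency logits
    target = torch.rand(2, 1, 56, 56).round()   # dummy ground-truth mask
    print(saliency_loss(logits, target).item())
```

In a full coarse-to-fine pipeline, a block like this would be applied at several backbone levels, with each level's prediction refining the previous one; the single-level example above only demonstrates the data flow.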
