Abstract

Depth maps have proven to be valuable supplements for salient object detection in recent years. However, most RGB-D salient object detection approaches ignore the prevalence of low-quality depth maps, which inevitably leads to unsatisfactory results. In this paper, we propose a depth cue enhancement and guidance network (DEGNet) for RGB-D salient object detection that improves depth quality and exploits depth cue guidance to generate predictions with highlighted objects and suppressed backgrounds. Specifically, a depth cue enhancement module is designed to generate high-quality depth maps by enhancing the contrast between the foreground and the background. Then, considering the different characteristics of unimodal RGB and depth features, we apply different feature enhancement strategies to strengthen the representation capability of side-output unimodal features. Moreover, we propose a depth-guided feature fusion module that excavates depth cues from the depth stream to guide the fusion of multi-modal features, fully exploiting the properties of each modality to generate discriminative cross-modal features. Finally, we aggregate cross-modal features at different levels with a pyramid feature shrinking structure to obtain the final prediction. Experimental results on six benchmark datasets demonstrate that the proposed DEGNet outperforms 17 state-of-the-art methods.
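The core idea of enhancing foreground/background contrast in a depth map can be illustrated with a minimal sketch. This is not the paper's actual depth cue enhancement module (which is learned end-to-end); it is a hypothetical stand-in that normalizes a depth map and applies a sigmoid stretch around its mean, pushing near (foreground) and far (background) values apart:

```python
import numpy as np

def enhance_depth_contrast(depth, k=10.0):
    """Illustrative only: increase foreground/background separation in a
    depth map via a sigmoid contrast stretch around the mean depth.
    `k` (assumed parameter) controls the steepness of the stretch."""
    d = depth.astype(np.float64)
    # Normalize depth values to [0, 1].
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    # Stretch values away from the mean depth: values below the mean are
    # pushed toward 0, values above it toward 1, sharpening the contrast
    # between foreground and background regions.
    return 1.0 / (1.0 + np.exp(-k * (d - d.mean())))
```

For example, a depth map with values clustered near its mean comes out with a wider spread (larger standard deviation), making the foreground easier to separate from the background in a downstream fusion stage.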
