Abstract

With the advent of depth sensors, the use of RGB and depth (D) information for salient object detection (SOD) has been explored extensively in recent years. However, depth quality usually varies across scenes, so fusing RGB with low-quality depth and exploiting the complementarity between the two modalities remains a challenging problem. In this paper, we first design a Double Dilated Merge Module (DDMM) that extracts comprehensive and beneficial high-level cross-modality features and further explores global context information at multiple scales to obtain a coarse saliency map. Then, we propose a Cross-Modality Enhance Module (CMEM) that improves the compatibility of cross-modality features and fuses them with the previously predicted coarse saliency map to generate a more accurate saliency map. Furthermore, we introduce a Distribution-Region Combination Loss (DRcom Loss) to optimize our proposed Double Cross-Modality Progressively Guided Network (DCPGNet) in a coarse-to-fine manner. DCPGNet achieves satisfactory performance on five public benchmarks compared with recent state-of-the-art algorithms.

