Abstract

It is well acknowledged that depth maps contain rich spatial information, which is crucial for explicitly distinguishing the foreground from the background in Salient Object Detection (SOD). With the help of depth maps, SOD performance has improved markedly. Nevertheless, low-quality depth maps fail to capture accurate spatial information, so it is undesirable to utilize depth maps indiscriminately. To this end, we propose a Discriminant and Cross-Modality Network (DCMNet) for RGB-D salient object detection. In DCMNet, we integrate a Depth Decomposition and Recomposition Module (DDRM) to filter out low-quality depth maps, and then apply a quality-enhancement procedure to these detrimental depth maps. Meanwhile, we propose a Multi-Cross Attention Module (MCAM), which combines spatial attention with channel attention in a multi-cross manner to better exploit rich details about the salient object from the RGB stream and the depth stream. In addition, we employ a Res2Net model, termed the Image Pretraining Model (IPM), to efficiently excavate foreground information. By embedding DDRM, MCAM, and IPM, accuracy increases by a large margin. Extensive experiments show that the proposed DCMNet outperforms 14 state-of-the-art methods on five challenging public datasets.
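The cross-modal attention idea behind MCAM can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (the actual MCAM uses learned convolutions and a multi-cross topology described in the full text); here, channel attention is approximated by global average pooling and spatial attention by channel-wise mean pooling, and each stream is simply reweighted by the other stream's attention maps before additive fusion. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_cross_fuse(rgb, depth):
    """Toy cross-modal fusion of two (C, H, W) feature maps.

    Each stream is modulated by the OTHER stream's channel weights
    (global average pool over H, W) and spatial weights (mean over C),
    then the two refined streams are fused by element-wise addition.
    """
    # channel attention from the opposite modality: (C,) -> (C, 1, 1)
    rgb_ref = rgb * sigmoid(depth.mean(axis=(1, 2)))[:, None, None]
    depth_ref = depth * sigmoid(rgb.mean(axis=(1, 2)))[:, None, None]

    # spatial attention from the opposite modality: (H, W) -> (1, H, W)
    rgb_ref = rgb_ref * sigmoid(depth.mean(axis=0))[None, :, :]
    depth_ref = depth_ref * sigmoid(rgb.mean(axis=0))[None, :, :]

    return rgb_ref + depth_ref

# example: fuse two random 4-channel 8x8 feature maps
fused = multi_cross_fuse(np.random.rand(4, 8, 8), np.random.rand(4, 8, 8))
```

The fused map keeps the input shape, so it can feed a subsequent decoder stage unchanged.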
