Abstract

In the field of RGBD saliency detection, RGB and depth cues are generally given equal status by existing models. However, these models ignore that the two modalities differ significantly in their inherent attributes, so effective features cannot be drawn from depth maps. To address this issue, this paper proposes a novel RGBD saliency model with two key components: a contrast-guided depth feature extraction (CDFE) module and a cross-modal feature integration (CFI) module. Specifically, considering the specific properties of depth information, we first design a targeted CDFE module, which learns multi-level deep depth features by strengthening the depth contrast between foreground and background. Then, to integrate the multi-level cross-modal features, i.e. the multi-level deep RGB and depth features, sufficiently and reasonably, we equip the saliency inference branch with the CFI module, which performs two successive steps: information enrichment and feature enhancement. Extensive experiments on five challenging RGBD datasets clearly demonstrate the effectiveness and superiority of the proposed model against state-of-the-art RGBD saliency models.
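The abstract does not give the CDFE formulation, but the core idea it names, strengthening depth contrast between foreground and background, can be illustrated with a minimal sketch. The function below is purely hypothetical (the function name, the scaling rule, and the `alpha` parameter are assumptions, not the authors' method): it pushes depth values away from the scene mean, which is one simple way to amplify foreground/background separation in a normalized depth map.

```python
import numpy as np

def contrast_guided_enhance(depth, alpha=2.0):
    """Hypothetical contrast-strengthening step (illustrative only, not the
    paper's CDFE module): scale each depth value's deviation from the scene
    mean by `alpha`, amplifying foreground/background separation."""
    d = np.asarray(depth, dtype=np.float64)
    mean = d.mean()
    enhanced = mean + alpha * (d - mean)
    # Keep the enhanced map in the valid normalized-depth range [0, 1].
    return np.clip(enhanced, 0.0, 1.0)

# Toy normalized depth map: near (foreground) values ~0.2-0.3,
# far (background) values ~0.7-0.8.
depth = np.array([[0.2, 0.3],
                  [0.7, 0.8]])
out = contrast_guided_enhance(depth)
```

In the toy example, the foreground/background gap widens after enhancement, which is the effect the CDFE module is described as exploiting before deep depth features are learned.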
