Abstract

The past decade has witnessed great progress in RGB-D salient object detection (SOD). However, two bottlenecks limit its further development. The first is low-quality depth maps: most existing methods feed raw depth maps directly into detection, but low-quality depth images can degrade detection performance, so depth maps should not be used indiscriminately. The second is how to effectively predict saliency maps with clear boundaries and complete salient regions. To address these problems, we propose an Attention-Guided Multi-Modality Interaction Network (AMINet). For the first issue, we propose a quality enhancement strategy for unreliable depth images, named the Depth Enhancement Module (DEM). For the second issue, we propose the Cross-Modality Attention Module (CMAM) to rapidly locate salient regions. The Boundary-Aware Module (BAM) uses high-level features to guide low-level feature generation in a top-down manner, compensating for the dilution of boundary information. To further improve accuracy, we propose the Atrous Refined Block (ARB), which adaptively compensates for the shortcoming of atrous convolution. By integrating these interactive modules, features from the depth and RGB streams can be refined efficiently, which in turn boosts detection performance. Experimental results demonstrate that the proposed AMINet outperforms state-of-the-art (SOTA) methods on several public RGB-D datasets.
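
The abstract does not say which shortcoming of atrous convolution the ARB targets. One commonly cited limitation is sparse sampling: with dilation rate d, each kernel tap reads only every d-th input, so neighbouring inputs can be skipped. The sketch below illustrates that effect in one dimension using plain Python. It is only an illustration of atrous convolution itself, not the authors' ARB; the function names `atrous_conv1d` and `touched_offsets` and the toy signal are assumptions made for this example.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated (atrous) correlation on plain Python lists."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        # Each tap i reads the input at start + i * dilation.
        out.append(sum(k * signal[start + i * dilation]
                       for i, k in enumerate(kernel)))
    return out


def touched_offsets(kernel_size, dilation):
    """Input offsets (relative to the window start) read by one output."""
    return [i * dilation for i in range(kernel_size)]


# Toy input (assumed for illustration): a single impulse at index 3.
signal = [0, 0, 0, 1, 0, 0, 0, 0, 0]
kernel = [1, 1, 1]

dense = atrous_conv1d(signal, kernel, dilation=1)   # contiguous taps
sparse = atrous_conv1d(signal, kernel, dilation=2)  # taps two samples apart

print("dense :", dense)
print("sparse:", sparse)
print("offsets read with dilation 2:", touched_offsets(3, 2))
```

With dilation 2, the output position between the two responses never reads the impulse, even though its window spans it. Gaps of this kind are what a refinement block placed after atrous layers would plausibly need to fill.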
