Abstract

In recent years, the detection of distinctive objects in stereoscopic 3D images has drawn increasing attention. Unlike 2D salient object detection, salient object detection in stereoscopic 3D images must account for depth as well as appearance, which makes it highly challenging. Hence, we propose a novel Deep Convolutional Residual Autoencoder (DCRA) for end-to-end salient object detection in stereoscopic 3D images. The trainable core of the model takes raw stereoscopic 3D images as inputs and their corresponding ground-truth saliency masks as labels. A convolutional residual module serves as the basic building block of both the encoder and the decoder, and long-range skip connections link equal-sized feature maps between the encoder and the decoder. To explore the complex relationships and exploit the complementarity between RGB (photometric) and depth (geometric) information, multiple feature-map fusion modules are constructed. These modules integrate texture and structure information between the RGB and depth branches of the encoder and fuse their features across several multiscale layers. Finally, to optimize the DCRA parameters efficiently, a supervision pyramid based on a boundary loss and a background-prior loss is adopted; it applies supervised learning over the multiscale layers of the decoder to prevent vanishing gradients and accelerate training at the fusion stage. We compare the proposed DCRA with state-of-the-art methods on two challenging benchmark datasets. The experimental results demonstrate that DCRA performs favorably against these methods.
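The architectural ideas in the abstract can be illustrated with a minimal numpy sketch. All names, shapes, and weights below are hypothetical stand-ins: the residual module is reduced to a 1x1 channel-mixing "convolution" with an identity shortcut, and the RGB-depth fusion module to channel concatenation followed by a learned projection; the paper's actual layers are full convolutional blocks.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, weight):
    # Hypothetical stand-in for the convolutional residual module:
    # a 1x1 channel-mixing map plus an identity shortcut.
    # x: (H, W, C), weight: (C, C)
    return relu(x + x @ weight)

def fuse_rgb_depth(f_rgb, f_depth, w_fuse):
    # Hypothetical fusion module: concatenate RGB and depth features
    # along channels, then project back to C channels.
    f = np.concatenate([f_rgb, f_depth], axis=-1)  # (H, W, 2C)
    return relu(f @ w_fuse)                        # (H, W, C)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
rgb = rng.standard_normal((H, W, C))
depth = rng.standard_normal((H, W, C))
w = rng.standard_normal((C, C)) * 0.1
w_fuse = rng.standard_normal((2 * C, C)) * 0.1

enc_rgb = residual_block(rgb, w)       # RGB encoder branch
enc_depth = residual_block(depth, w)   # depth encoder branch
fused = fuse_rgb_depth(enc_rgb, enc_depth, w_fuse)

# Long-range skip connection: the decoder combines its input with the
# encoder's equal-sized feature map (here, simply by addition).
dec_in = fused + enc_rgb
print(fused.shape, dec_in.shape)  # (8, 8, 4) (8, 8, 4)
```

In the full model this pattern repeats at several scales, with the supervision pyramid attaching a loss to each decoder scale so that gradients reach every stage directly.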
