Abstract

Light field cameras record the spatial and angular information of a scene simultaneously. The micro-lens array placed in front of the sensor makes it possible to capture sub-aperture images as well as refocused images, so focus cues and depth cues are available from a single shot. However, previous deep-learning-based light field salient object detection (SOD) methods extract features from only one of these cues (focus or depth). In this paper, we propose a new deep convolutional network that performs SOD using both focal stacks and depth maps. We construct FSNet from 3D octave convolution blocks to extract and retain the continuously changing focus cues in the focal stack, and DepthNet to extract depth cues from the depth map. The focal stack features and depth features are then combined for the subsequent SOD task. Experiments on three benchmark light field SOD datasets show that our method is effective.
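To illustrate the building block named in the abstract, the following is a minimal sketch of a 3D octave convolution layer in PyTorch, assuming the standard high/low-frequency channel split of octave convolution extended with 3D kernels over a focal stack. The class name `OctConv3d`, the parameter `alpha`, and the choice to downsample only the spatial axes (keeping the focal-slice axis intact) are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctConv3d(nn.Module):
    """Illustrative 3D octave convolution (not the paper's exact layer).

    Channels are split into a full-resolution (high-frequency) branch and a
    spatially downsampled (low-frequency) branch; the four convolutions
    exchange information between the two branches.
    """
    def __init__(self, in_ch, out_ch, alpha=0.5, kernel_size=3, padding=1):
        super().__init__()
        lo_in, lo_out = int(alpha * in_ch), int(alpha * out_ch)
        hi_in, hi_out = in_ch - lo_in, out_ch - lo_out
        self.h2h = nn.Conv3d(hi_in, hi_out, kernel_size, padding=padding)
        self.h2l = nn.Conv3d(hi_in, lo_out, kernel_size, padding=padding)
        self.l2h = nn.Conv3d(lo_in, hi_out, kernel_size, padding=padding)
        self.l2l = nn.Conv3d(lo_in, lo_out, kernel_size, padding=padding)

    def forward(self, x_h, x_l):
        # x_h: (B, hi_in, D, H, W); x_l: (B, lo_in, D, H/2, W/2).
        # Pooling/upsampling acts only on H and W, so the focal-slice
        # axis D (the continuously changing focus cue) is preserved.
        y_h = self.h2h(x_h) + F.interpolate(
            self.l2h(x_l), size=x_h.shape[2:],
            mode="trilinear", align_corners=False)
        y_l = self.l2l(x_l) + self.h2l(
            F.avg_pool3d(x_h, kernel_size=(1, 2, 2)))
        return y_h, y_l

# Example: 6 focal slices at 32x32, 16 input and 32 output channels.
oc = OctConv3d(16, 32, alpha=0.5)
x_h = torch.randn(1, 8, 6, 32, 32)
x_l = torch.randn(1, 8, 6, 16, 16)
y_h, y_l = oc(x_h, x_l)
```

With `alpha=0.5`, the output splits evenly: `y_h` has shape `(1, 16, 6, 32, 32)` and `y_l` has shape `(1, 16, 6, 16, 16)`; keeping stride 1 along the slice axis is one way such a block could retain focus cues across the stack.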
