Abstract

2D saliency detection algorithms ignore the 3D visual information of scenes, which leads to poor performance on challenging scene images. Although light field data contains rich 3D visual information, existing CNN-based algorithms are designed for 2D RGB images rather than light field images. To overcome these issues, this paper proposes a cross-level feature aggregation and fusion network for light field salient object detection. To make full use of 3D visual information, two parallel sub-networks are designed to handle all-focus images and depth maps separately. Feature aggregation modules are then built to aggregate cross-level visual features and identify salient objects in the scene. In addition, feature fusion modules are designed to fuse cross-modal features from all-focus images, the focal stack, and depth maps, which highlights salient objects consistently by exploiting 3D visual information. Comprehensive experiments on three benchmark datasets show that our algorithm outperforms state-of-the-art methods both quantitatively and qualitatively on five evaluation metrics.
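To make the described design concrete, the following is a minimal, hypothetical PyTorch sketch of a two-stream network with cross-modal fusion and cross-level (top-down) aggregation. All module names, channel sizes, and the toy encoder are illustrative assumptions; they are not the authors' architecture, and the focal-stack branch mentioned in the abstract is omitted for brevity.

```python
# Hypothetical sketch only: two parallel streams (all-focus RGB and depth),
# per-level cross-modal fusion, and top-down cross-level aggregation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalFusion(nn.Module):
    """Fuse same-level features from the all-focus (RGB) and depth streams."""
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, rgb_feat, depth_feat):
        return F.relu(self.fuse(torch.cat([rgb_feat, depth_feat], dim=1)))


class CrossLevelAggregation(nn.Module):
    """Merge a coarser (higher-level) feature into the next finer level."""
    def __init__(self, high_ch, low_ch, out_ch):
        super().__init__()
        self.merge = nn.Conv2d(high_ch + low_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, high, low):
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear",
                             align_corners=False)
        return F.relu(self.merge(torch.cat([high, low], dim=1)))


class Encoder(nn.Module):
    """Tiny stand-in for a backbone; returns features at three scales."""
    def __init__(self, in_ch, chs=(32, 64, 128)):
        super().__init__()
        stages, prev = [], in_ch
        for c in chs:
            stages.append(nn.Sequential(
                nn.Conv2d(prev, c, 3, stride=2, padding=1), nn.ReLU()))
            prev = c
        self.stages = nn.ModuleList(stages)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # low -> high level


class TwoStreamSaliencyNet(nn.Module):
    """Toy two-stream saliency network producing a 1-channel saliency map."""
    def __init__(self, chs=(32, 64, 128)):
        super().__init__()
        self.rgb_enc = Encoder(3, chs)     # all-focus image stream
        self.depth_enc = Encoder(1, chs)   # depth map stream
        self.fusions = nn.ModuleList([CrossModalFusion(c) for c in chs])
        self.agg2 = CrossLevelAggregation(chs[2], chs[1], chs[1])
        self.agg1 = CrossLevelAggregation(chs[1], chs[0], chs[0])
        self.head = nn.Conv2d(chs[0], 1, kernel_size=1)

    def forward(self, rgb, depth):
        rgb_feats = self.rgb_enc(rgb)
        depth_feats = self.depth_enc(depth)
        # Cross-modal fusion at each level, then top-down cross-level aggregation.
        fused = [f(r, d) for f, r, d in zip(self.fusions, rgb_feats, depth_feats)]
        x = self.agg2(fused[2], fused[1])
        x = self.agg1(x, fused[0])
        sal = self.head(x)
        return torch.sigmoid(F.interpolate(sal, size=rgb.shape[2:],
                                           mode="bilinear", align_corners=False))


if __name__ == "__main__":
    net = TwoStreamSaliencyNet()
    rgb = torch.randn(1, 3, 224, 224)     # all-focus image
    depth = torch.randn(1, 1, 224, 224)   # depth map
    print(net(rgb, depth).shape)          # torch.Size([1, 1, 224, 224])
```

The sketch keeps the two design ideas named in the abstract separate: fusion operates across modalities at the same feature level, while aggregation operates across levels within the fused decoder path.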
