Abstract
To identify the most visually salient regions in paired RGB images and depth maps, we propose a multimodal-feature-fusion supervised RGB-D saliency detection network. The network learns RGB and depth data in two independent streams, uses a dual-stream side-supervision module to obtain layer-wise saliency maps from the RGB and depth features separately, and then uses a multimodal feature fusion module to fuse the high-dimensional RGB and depth information from the last three layers to generate the final saliency prediction. Experiments on three publicly available datasets show that, owing to the dual-stream side-supervision module and the multimodal feature fusion module, the proposed network outperforms current mainstream RGB-D saliency detection models and exhibits strong robustness. We apply the proposed RGB-D SOD model to background defocusing in real-world scenes and achieve excellent visual results.
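The fusion scheme described above can be sketched in a minimal, framework-free form. This is not the paper's implementation: the projection to a single channel (channel mean), the averaging fusion rule, and the toy feature shapes are all assumptions chosen for illustration; the actual network would use learned convolutions for both the side-supervision heads and the fusion module.

```python
import numpy as np

def sigmoid(x):
    # Squash logits into (0, 1) to form a saliency map.
    return 1.0 / (1.0 + np.exp(-x))

def side_supervision_map(feat):
    # Stand-in for one side-supervision head: collapse a (C, H, W)
    # feature tensor to a single-channel saliency map. A channel mean
    # replaces the learned 1x1 convolution used in practice (assumption).
    return sigmoid(feat.mean(axis=0))

def fuse_high_level(rgb_feats, depth_feats):
    # Fuse the last three layers of the RGB and depth streams:
    # concatenate along the channel axis, reduce each fused layer to a
    # saliency map, then average the maps (hypothetical fusion rule).
    maps = []
    for r, d in zip(rgb_feats[-3:], depth_feats[-3:]):
        fused = np.concatenate([r, d], axis=0)  # channel concatenation
        maps.append(side_supervision_map(fused))
    return np.mean(maps, axis=0)

# Toy multi-scale features: five layers per stream, channel count
# doubling with depth, spatial size fixed at 8x8 for simplicity.
rng = np.random.default_rng(0)
rgb_feats   = [rng.random((2 ** i, 8, 8)) for i in range(3, 8)]
depth_feats = [rng.random((2 ** i, 8, 8)) for i in range(3, 8)]

saliency = fuse_high_level(rgb_feats, depth_feats)
print(saliency.shape)  # single-channel prediction, same spatial size
```

In the real network each per-layer saliency map would also receive its own supervision signal during training, which is what "side-supervision" refers to; here the maps are only averaged to show the data flow.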