Abstract
Depth information acquired by depth sensors is increasingly being exploited for salient object detection (SOD). Despite the remarkable results of recent deep learning approaches to RGB-D SOD, these methods fail to fully exploit the original, accurate information in RGB-D images to capture the details of salient objects. Here, we propose an RGB-D SOD model based on a three-input multilevel fusion network (TMFNet), which differs from existing methods built on double-stream networks. In addition to the RGB input (first input) and the depth input (second input), the RGB image and depth map are stacked into a four-channel representation (RGBD input) that constitutes the third input to the TMFNet. The RGBD input generates multilevel features that reflect the details of the RGB-D image. Furthermore, the proposed TMFNet aggregates diverse region-based contextual information without discarding RGB and depth features. To this end, we introduce a cross-fusion module; benefiting from the rich low- and high-level information in the salient features, this fusion improves the localization of salient objects. The proposed TMFNet achieves state-of-the-art performance on six benchmark SOD datasets.
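The idea of stacking the RGB image and depth map into a four-channel third input, alongside the two conventional streams, can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under our own assumptions: the class names (ConvStream, ThreeInputFusionSketch) are hypothetical, the toy extractors stand in for real backbones, and plain channel concatenation substitutes for the paper's cross-fusion module, whose internals the abstract does not specify.

```python
# Minimal sketch (not the authors' code) of the three-input idea:
# separate RGB and depth streams plus a third stream fed by a
# four-channel RGBD tensor built by channel-wise stacking.
import torch
import torch.nn as nn


class ConvStream(nn.Module):
    """Toy feature extractor standing in for one backbone stream."""

    def __init__(self, in_channels: int, out_channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


class ThreeInputFusionSketch(nn.Module):
    """Three parallel streams (RGB, depth, stacked RGBD) whose features
    are fused by simple concatenation -- a placeholder for the
    cross-fusion module described in the abstract."""

    def __init__(self, feat: int = 64):
        super().__init__()
        self.rgb_stream = ConvStream(3, feat)    # first input: RGB
        self.depth_stream = ConvStream(1, feat)  # second input: depth
        self.rgbd_stream = ConvStream(4, feat)   # third input: 4-channel RGBD
        self.fuse = nn.Conv2d(3 * feat, 1, 1)    # stand-in fusion head

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # The third input is built by stacking RGB and depth along channels.
        rgbd = torch.cat([rgb, depth], dim=1)    # (B, 4, H, W)
        f_rgb = self.rgb_stream(rgb)
        f_depth = self.depth_stream(depth)
        f_rgbd = self.rgbd_stream(rgbd)
        fused = torch.cat([f_rgb, f_depth, f_rgbd], dim=1)
        return torch.sigmoid(self.fuse(fused))   # saliency map in [0, 1]


if __name__ == "__main__":
    rgb = torch.randn(1, 3, 224, 224)
    depth = torch.randn(1, 1, 224, 224)
    saliency = ThreeInputFusionSketch()(rgb, depth)
    print(saliency.shape)  # torch.Size([1, 1, 224, 224])
```

In this sketch the RGBD stream sees raw RGB and depth jointly from the first layer on, which is one plausible reading of how the third input could preserve details that the two separate streams lose; the actual TMFNet fuses multilevel features through its cross-fusion module rather than a single 1x1 convolution.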