Abstract
Indoor RGB-D semantic segmentation is a new and challenging problem. Traditional methods usually apply two-stream convolutional neural networks (CNNs) to represent the RGB and depth images respectively, and fuse the two streams at a specific layer. In this paper, we explore several fusion strategies within this two-stream-CNN framework and point out that such single-layer fusion cannot fully exploit the complementary RGB and depth cues for semantic segmentation. To address this problem, we propose a novel Semantics-guided Multi-level feature fusion approach, which first learns deep feature representations in a bottom-up manner, and then gradually fuses the RGB and depth features from high level to low level under the guidance of semantic cues. Experimental results on the SUN RGB-D dataset demonstrate the advantages of the proposed method over state-of-the-art approaches.
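The high-to-low, semantics-guided fusion described above can be sketched in a minimal form. The following is a hypothetical illustration, not the paper's actual architecture: it assumes feature pyramids stored as `(C, H, W)` arrays ordered from coarsest to finest level, uses the confidence of a coarse semantic prediction to gate the depth stream, and uses nearest-neighbour upsampling in place of learned decoders. All function names and shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=0):
    # numerically stable softmax over the class axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def upsample2x(f):
    # nearest-neighbour 2x upsampling of a (C, H, W) feature map
    return f.repeat(2, axis=1).repeat(2, axis=2)

def semantics_guided_fuse(rgb_feats, depth_feats, sem_logits):
    """Illustrative sketch of semantics-guided multi-level fusion.

    rgb_feats / depth_feats: lists of (C, H, W) arrays, ordered from the
        highest level (coarsest) to the lowest level (finest); spatial
        size is assumed to double at each step.
    sem_logits: (num_classes, H0, W0) coarse semantic prediction at the
        top level, whose per-pixel confidence gates the depth features.
    """
    # per-pixel confidence of the coarse semantic prediction, shape (1, H0, W0)
    conf = softmax(sem_logits, axis=0).max(axis=0, keepdims=True)
    # fuse at the highest (coarsest) level first
    fused = rgb_feats[0] + conf * depth_feats[0]
    # then refine from high level to low level, reusing the semantic gate
    for rgb, dep in zip(rgb_feats[1:], depth_feats[1:]):
        conf = upsample2x(conf)
        fused = upsample2x(fused) + rgb + conf * dep
    return fused
```

The key idea this sketch conveys is that depth features are not merged uniformly: their contribution at every level is modulated by how confident the semantic prediction is at each pixel, so fusion is driven top-down by semantics rather than fixed at a single layer.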