Abstract

Semantic segmentation is a crucial task in vision measurement systems that involves understanding and segmenting different objects and regions within an image. Over the years, numerous RGB-D semantic segmentation methods have been developed, leveraging the encoder-decoder architecture to achieve outstanding performance. However, existing methods suffer from two main problems that constrain further performance improvement. First, in the encoding stage, existing methods have a weak ability to fuse cross-modal information, and low-quality depth maps can easily lead to poor feature representations. Second, in the decoding stage, the upsampling of high-level semantic information may cause the loss of contextual information, and low-level features from the encoder may introduce noise into the decoder through skip connections. To address these issues, we propose a novel Encoding Fusion and Decoding Correction Network (EFDCNet) for RGB-D indoor semantic segmentation. First, in the encoding stage of EFDCNet, we focus on extracting valuable information from low-quality depth maps and employ a channel-wise filter to select informative depth features. Additionally, we establish global dependencies between RGB and depth features via the self-attention mechanism to enhance cross-modal feature interactions, extracting discriminative and powerful features. Then, in the decoding stage of EFDCNet, we use the highest-level information as semantic guidance to compensate for information lost during upsampling and to filter out noise from the low-level encoder features propagated to the decoder through skip connections. Extensive experiments on two widely used RGB-D indoor semantic segmentation datasets demonstrate that the proposed EFDCNet outperforms relevant state-of-the-art methods. The code is available at https://github.com/Mark9010/EFDCNet.
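To make the abstract's two stages concrete, below is a minimal PyTorch sketch of the three mechanisms it names: a channel-wise filter that re-weights depth channels, a self-attention block that establishes global RGB-depth dependencies, and a semantic-guidance gate that filters low-level skip features. All module names, layer choices (squeeze-and-excitation-style gating, standard multi-head attention, sigmoid gating), and hyperparameters here are illustrative assumptions, not the actual EFDCNet implementation; see the linked repository for the authors' code.

```python
# Hypothetical sketches of the abstract's mechanisms; names and layer
# choices are assumptions, not the released EFDCNet modules.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelWiseDepthFilter(nn.Module):
    """Re-weights depth channels (squeeze-and-excitation style) so that
    uninformative channels from a low-quality depth map are suppressed."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # global per-channel statistics
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel importance in [0, 1]
        )

    def forward(self, depth_feat):
        return depth_feat * self.gate(depth_feat)          # keep informative depth channels

class CrossModalSelfAttention(nn.Module):
    """RGB queries attend over depth keys/values, so every RGB position
    gets a global view of the depth features (cross-modal dependencies)."""
    def __init__(self, channels, heads=4):
        super().__init__()
        assert channels % heads == 0, "channels must be divisible by heads"
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, rgb_feat, depth_feat):
        b, c, h, w = rgb_feat.shape
        q = rgb_feat.flatten(2).transpose(1, 2)            # (B, H*W, C) RGB queries
        kv = depth_feat.flatten(2).transpose(1, 2)         # (B, H*W, C) depth keys/values
        fused, _ = self.attn(q, kv, kv)                    # global RGB-depth attention
        fused = self.norm(q + fused)                       # residual connection + norm
        return fused.transpose(1, 2).reshape(b, c, h, w)

class SemanticGuidedSkipFilter(nn.Module):
    """Uses the highest-level semantic feature to gate a low-level skip
    connection, suppressing encoder noise before it reaches the decoder."""
    def __init__(self, low_ch, high_ch):
        super().__init__()
        self.proj = nn.Conv2d(high_ch, low_ch, 1)          # map semantics to skip channels

    def forward(self, low_feat, high_feat):
        guide = F.interpolate(self.proj(high_feat), size=low_feat.shape[2:],
                              mode="bilinear", align_corners=False)
        return low_feat * torch.sigmoid(guide)             # semantic gating of skip features

# Example usage (shapes are illustrative): filter the depth features,
# then fuse them with the RGB features of the same encoder stage.
rgb = torch.randn(2, 64, 60, 80)
depth = torch.randn(2, 64, 60, 80)
fused = CrossModalSelfAttention(64)(rgb, ChannelWiseDepthFilter(64)(depth))
```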
