Abstract
Red-green-blue and depth (RGB-D) semantic segmentation is essential for indoor service robots to perceive and understand their surroundings accurately. Numerous RGB-D indoor semantic segmentation methods have been proposed since depth maps became widely available. These methods focus mainly on integrating the multiscale and crossmodal features extracted from RGB images and depth maps in the encoder, and then apply unified strategies to progressively recover local details in the decoder. However, by emphasizing crossmodal fusion in the encoder, they neglect the distinguishability between RGB and depth features during decoding, which undermines segmentation performance. To exploit the features fully, we propose an efficient encoder-decoder architecture, the asymmetric multiscale and crossmodal fusion network (AMCFNet), which adopts a differentiated feature-integration strategy. Unlike existing methods, we use simple crossmodal fusion in the encoder and design an elaborate decoder to improve semantic segmentation performance. Specifically, to treat high- and low-level features differently, we propose a semantic aggregation module (SAM) that processes the multiscale and crossmodal features from the last three network layers and aggregates high-level semantic information through a cascaded pyramid structure. Moreover, we design a spatial detail supplement module that adaptively fuses low-level spatial details from the depth maps with the information produced by the SAM. Extensive experiments demonstrate that the proposed AMCFNet outperforms state-of-the-art approaches.
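Because the abstract only outlines the decoder design, the following is a minimal PyTorch sketch of how a cascaded-pyramid semantic aggregation stage and an adaptive spatial detail supplement stage might be wired together. The module names, channel sizes, gating mechanism, and feature shapes are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticAggregationModule(nn.Module):
    """Hypothetical SAM-style block: fuses RGB and depth features at one
    scale and adds in coarser semantics passed down the cascaded pyramid."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels * 2, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb_feat, depth_feat, higher_level=None):
        x = self.fuse(torch.cat([rgb_feat, depth_feat], dim=1))
        if higher_level is not None:
            # Upsample the coarser-level semantics and accumulate them.
            x = x + F.interpolate(higher_level, size=x.shape[-2:],
                                  mode="bilinear", align_corners=False)
        return x


class SpatialDetailSupplement(nn.Module):
    """Hypothetical detail-supplement block: adaptively blends low-level
    depth details with the aggregated semantics via a learned gate."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, semantics, depth_detail):
        # Bring the coarse semantics up to the resolution of the details.
        semantics = F.interpolate(semantics, size=depth_detail.shape[-2:],
                                  mode="bilinear", align_corners=False)
        g = self.gate(torch.cat([semantics, depth_detail], dim=1))
        return g * semantics + (1.0 - g) * depth_detail


# Toy forward pass with made-up shapes: last three encoder stages at
# 28x28, 14x14, 7x7, all with 64 channels (assumed for simplicity).
rgb_feats = [torch.randn(1, 64, s, s) for s in (28, 14, 7)]
dep_feats = [torch.randn(1, 64, s, s) for s in (28, 14, 7)]
sams = [SemanticAggregationModule(64, 64) for _ in range(3)]

x = None
for rgb, dep, sam in zip(reversed(rgb_feats), reversed(dep_feats), reversed(sams)):
    x = sam(rgb, dep, higher_level=x)          # coarse-to-fine cascade

low_level_depth = torch.randn(1, 64, 56, 56)   # hypothetical low-level depth detail
out = SpatialDetailSupplement(64)(x, low_level_depth)
print(out.shape)                               # torch.Size([1, 64, 56, 56])
```

The sketch only mirrors the data flow described above: the three deepest feature pairs are fused and aggregated coarse-to-fine, and the result is then combined with low-level depth detail through an adaptive gate before producing the full-resolution prediction.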