Abstract

RGBD images, i.e., three-channel color images augmented with a depth channel, have been widely exploited to improve the performance of scene semantic segmentation. RGBD semantic segmentation is of great significance for the development of robot navigation and grasping. However, most existing methods for RGBD semantic segmentation lack effective feature fusion modules. To address this problem, we propose a novel network with two new modules for feature fusion: the Channel-Attention based Complementary Feature Fusion Module (CAC-FFM) and the Cross-Layer Feature Fusion Module (CL-FFM). Specifically, CAC-FFM builds on the channel-attention mechanism to exploit the complementary information in RGB and depth features and generate fused features, while CL-FFM captures patch-wise features from low-level feature maps to assist the training of high-level features, further refining the segmentation results. Experimental results on the publicly available NYUDv2 dataset validate the effectiveness and superiority of the proposed method.
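The abstract does not specify the internal design of CAC-FFM. As a minimal sketch of the general idea of channel-attention based RGB-D fusion, the following module applies squeeze-and-excitation style per-channel gating to concatenated RGB and depth features; the class name, layer sizes, and gating scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ChannelAttentionFusion(nn.Module):
    """Hypothetical sketch of channel-attention based RGB-D fusion.

    NOT the paper's CAC-FFM: just a common squeeze-and-excitation
    style gating over concatenated RGB and depth feature maps.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Global average pooling squeezes spatial dims to 1x1.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Bottleneck MLP produces per-channel gates for both streams.
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * channels, (2 * channels) // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d((2 * channels) // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # 1x1 conv merges the reweighted streams back to `channels`.
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, depth], dim=1)  # (B, 2C, H, W)
        gates = self.mlp(self.pool(x))      # (B, 2C, 1, 1), values in (0, 1)
        return self.merge(x * gates)        # (B, C, H, W) fused features


if __name__ == "__main__":
    fuse = ChannelAttentionFusion(channels=64)
    rgb = torch.randn(2, 64, 32, 32)
    depth = torch.randn(2, 64, 32, 32)
    print(fuse(rgb, depth).shape)  # torch.Size([2, 64, 32, 32])
```

The intuition behind such gating is that RGB and depth streams carry complementary cues (texture vs. geometry), so letting the network learn per-channel weights allows it to emphasize whichever modality is more informative for each feature channel.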
