Abstract

The complementary information in RGB and thermal images can substantially boost semantic segmentation performance. However, existing RGB-T segmentation methods usually rely on simple interaction strategies to extract this complementary information, overlooking the distinct discriminative features that arise from the two different imaging mechanisms. To address these problems, we propose MS-IRTNet, a multistage information interaction network for RGB-T semantic segmentation. MS-IRTNet uses a dual-stream encoder to extract feature information at multiple stages. To enable richer interaction between the two modalities, we design a gate-weighted interaction module (GWIM) and a feature information interaction module (FIIM). GWIM learns per-channel weights for the multimodal information, while FIIM integrates the weighted RGB and thermal features into a single fused feature map. Finally, the multistage interactive information is fed into the decoder for semantic prediction. Our method achieves 60.5 mIoU on the MFNet dataset, outperforming state-of-the-art methods. Notably, MS-IRTNet also achieves state-of-the-art results on daytime images (51.7 mIoU) and nighttime images (62.5 mIoU). The code and pre-trained models are available at https://github.com/poisonzzw/MS-IRTNet.
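
To make the two modules concrete, the sketch below shows one plausible reading of the abstract in PyTorch: a gate that learns per-channel weights for both modalities (GWIM-style) followed by a convolutional fusion of the weighted streams (FIIM-style). This is a minimal illustration under our own assumptions, not the authors' implementation; the class names, the squeeze-and-gate design, and the 3x3 fusion convolution are all hypothetical, and the released code at the repository above should be treated as authoritative.

```python
import torch
import torch.nn as nn


class GateWeightedInteraction(nn.Module):
    """Hypothetical GWIM-style gate: learns per-channel weights for
    RGB and thermal features from their pooled concatenation."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # squeeze spatial dims to 1x1
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        # Learn channel weights jointly from both modalities,
        # then split them back into one gate per stream.
        w = self.gate(torch.cat([rgb, thermal], dim=1))
        w_rgb, w_th = torch.chunk(w, 2, dim=1)
        return rgb * w_rgb, thermal * w_th  # reweighted features


class FeatureInteraction(nn.Module):
    """Hypothetical FIIM-style fusion: integrates the two weighted
    streams into a single feature map via a 3x3 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        return self.fuse(torch.cat([rgb, thermal], dim=1))


if __name__ == "__main__":
    gwim = GateWeightedInteraction(channels=64)
    fiim = FeatureInteraction(channels=64)
    rgb = torch.randn(2, 64, 60, 80)      # one encoder stage's RGB features
    thermal = torch.randn(2, 64, 60, 80)  # matching thermal features
    fused = fiim(*gwim(rgb, thermal))
    print(fused.shape)                    # torch.Size([2, 64, 60, 80])
```

In a full dual-stream encoder, a pair like this would be applied at each stage, and the resulting multistage fused maps would be passed to the decoder for prediction.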
