ECFFNet: Effective and Consistent Feature Fusion Network for RGB-T Salient Object Detection

Wujie Zhou, Lu Yu, Qinling Guo, Jenq-Neng Hwang, Jingsheng Lei

doi:10.1109/tcsvt.2021.3077058

Abstract

Under ideal environmental conditions, RGB-based deep convolutional neural networks can achieve high performance for salient object detection (SOD). In scenes with cluttered backgrounds and many objects, depth maps have been combined with RGB images to better distinguish spatial positions and structures during SOD, which yields high accuracy. However, under low-light and uneven lighting conditions, RGB and depth information may be insufficient for detection. Thermal images are insensitive to lighting and weather conditions and can capture important objects even at night. We therefore propose an effective and consistent feature fusion network (ECFFNet) that combines thermal and RGB images for RGB-T SOD. In ECFFNet, an effective cross-modality fusion module fully fuses features of corresponding sizes from the RGB and thermal modalities. Then, a bilateral reversal fusion module performs bilateral fusion of foreground and background information, enabling the full extraction of salient object boundaries. Finally, a multilevel consistent fusion module combines features across different levels to obtain complementary information. Comprehensive experiments on three RGB-T SOD datasets show that the proposed ECFFNet outperforms 12 state-of-the-art methods across different evaluation metrics.

Full Text