Abstract

Depth information acquired by depth sensors is increasingly being exploited for salient object detection (SOD). Despite the remarkable results of recent deep learning approaches to RGB-D SOD, they fail to fully exploit the original, accurate information in RGB-D images to express the details of salient objects. Here, we propose an RGB-D SOD model using a three-input multilevel fusion network (TMFNet), which differs from existing methods based on double-stream networks. In addition to the RGB input (first input) and the depth input (second input), the RGB image and depth map are combined into a four-channel representation (the RGBD input) that constitutes the third input to the TMFNet. The RGBD input generates multilevel features that reflect the details of the RGB-D image. Moreover, the proposed TMFNet aggregates diverse region-based contextual information without discarding RGB or depth features. To this end, we introduce a cross-fusion module; benefiting from the rich low- and high-level information in salient features, this fusion improves the localization of salient objects. The proposed TMFNet achieves state-of-the-art performance on six benchmark SOD datasets.
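For concreteness, the following is a minimal PyTorch sketch of the three-input construction and of one plausible cross-fusion design. The tensor shapes, the `CrossFusion` module, and the sigmoid-gating scheme are illustrative assumptions on our part; the abstract does not specify the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Sketch of the three inputs described in the abstract. Shapes and the
# gating design below are illustrative assumptions, not the authors'
# implementation.
rgb = torch.rand(1, 3, 224, 224)        # first input: RGB image
depth = torch.rand(1, 1, 224, 224)      # second input: single-channel depth map
rgbd = torch.cat([rgb, depth], dim=1)   # third input: four-channel RGBD
assert rgbd.shape == (1, 4, 224, 224)


class CrossFusion(nn.Module):
    """Hypothetical cross-fusion of RGB and depth features: each stream
    is gated by the other before the two are merged."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate_from_depth = nn.Conv2d(channels, channels, 3, padding=1)
        self.gate_from_rgb = nn.Conv2d(channels, channels, 3, padding=1)
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
        # Modulate each stream by a sigmoid gate computed from the other,
        # then merge the two gated feature maps with a 1x1 convolution.
        f_rgb_gated = f_rgb * torch.sigmoid(self.gate_from_depth(f_depth))
        f_depth_gated = f_depth * torch.sigmoid(self.gate_from_rgb(f_rgb))
        return self.merge(torch.cat([f_rgb_gated, f_depth_gated], dim=1))


# Usage on dummy mid-level features from the two streams.
fuse = CrossFusion(64)
out = fuse(torch.rand(1, 64, 56, 56), torch.rand(1, 64, 56, 56))
assert out.shape == (1, 64, 56, 56)
```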
