Abstract

Most existing RGB and thermal (RGB-T) salient object detection (SOD) techniques focus on devising multi-modality feature fusion strategies to capture the cross-modality complementary information within RGB and thermal images. However, most of these strategies do not explicitly model the interactions among the features of different modalities, leading to insufficient exploitation of cross-modality complementary information. In this paper, we propose a novel RGB-T SOD model that alleviates this issue by leveraging a modality-aware and scale-aware feature fusion module. This module captures cross-modality complementary information by exploiting the interactions of single-modality features across modalities and the interactions of multi-modality features across scales. A stage-wise feature aggregation module is also proposed to thoroughly exploit cross-level complementary information and reduce its redundancy, generating accurate saliency maps with sharp boundaries. To this end, a novel multi-level feature aggregation structure with two types of feature aggregation nodes is employed. Experimental results on several benchmark datasets verify the effectiveness and superiority of our proposed model over state-of-the-art models.
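The abstract does not specify the internal design of the modality-aware fusion module, so the following is only a minimal sketch of the general idea it describes: letting each modality's features explicitly interact with the other's before fusion. All names (`CrossModalityFusion`, the mutual channel-gating design, the layer sizes) are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn


class CrossModalityFusion(nn.Module):
    """Hypothetical sketch: each modality's features are reweighted by a
    channel-attention gate computed from the OTHER modality, so the two
    streams interact explicitly before being merged (assumed design)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()

        def gate() -> nn.Sequential:
            # Squeeze-and-excitation style channel gate (an assumption).
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
                nn.Sigmoid(),
            )

        self.rgb_from_thermal = gate()  # thermal features gate the RGB stream
        self.thermal_from_rgb = gate()  # RGB features gate the thermal stream
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, f_rgb: torch.Tensor, f_th: torch.Tensor) -> torch.Tensor:
        # Cross-modality interaction: each stream is modulated by the other.
        rgb_out = f_rgb * self.rgb_from_thermal(f_th)
        th_out = f_th * self.thermal_from_rgb(f_rgb)
        # Concatenate the interacted streams and fuse them into a single map.
        return self.merge(torch.cat([rgb_out, th_out], dim=1))


if __name__ == "__main__":
    block = CrossModalityFusion(channels=64)
    rgb = torch.randn(2, 64, 56, 56)      # batch of RGB backbone features
    thermal = torch.randn(2, 64, 56, 56)  # matching thermal features
    print(block(rgb, thermal).shape)      # torch.Size([2, 64, 56, 56])
```

In a full model, a block of this kind would typically be applied at each backbone stage, with the per-scale fused outputs then passed to the cross-scale and stage-wise aggregation stages the abstract mentions.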
