Abstract

RGB-D salient object detection (SOD) aims to locate the most visually salient object in a scene by fusing complementary information from the RGB and depth modalities. Most existing RGB-D SOD methods integrate multi-modal features indiscriminately when generating the saliency map, ignoring the ambiguity between the two modalities. To better exploit multi-modal complementary information and alleviate the negative impact of cross-modal ambiguity, this paper proposes a novel Alternate Steered Attention and Trapezoidal Pyramid Fusion Network (A2TPNet) for RGB-D SOD, composed of a Cross-modal Alternate Fusion Module (CAFM) and a Trapezoidal Pyramid Fusion Module (TPFM). CAFM focuses on fusing cross-modal features: an Alternate Steered Attention (ASA) explicitly accounts for the ambiguity between the modalities, while a collaboration mechanism combining channel attention and spatial attention reduces the interference of redundant information and non-salient features during the interaction. TPFM combines multi-scale features to strengthen the model's contextual semantic representation, endowing the model with more powerful feature expression capability. Extensive experiments on five publicly available datasets demonstrate that the proposed model consistently outperforms 17 state-of-the-art methods.
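
The following is a minimal, illustrative sketch of cross-modal fusion in which each modality steers the other through channel attention and the fused result is refined by spatial attention, as described for CAFM above. It is not the authors' implementation; the module names `ChannelAttention`, `SpatialAttention`, and `CrossModalFusion` and all hyperparameters are assumptions introduced for illustration.

```python
# Illustrative sketch only; not the authors' CAFM. Module names and
# hyperparameters below are hypothetical.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Produce per-channel weights in [0, 1] from globally pooled features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.fc(self.pool(x))


class SpatialAttention(nn.Module):
    """Produce a spatial weight map from channel-wise mean and max maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_map = torch.mean(x, dim=1, keepdim=True)
        max_map, _ = torch.max(x, dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn


class CrossModalFusion(nn.Module):
    """Each modality's channel weights steer the other modality's features;
    the steered features are summed and refined by spatial attention."""
    def __init__(self, channels):
        super().__init__()
        self.ca_from_rgb = ChannelAttention(channels)
        self.ca_from_depth = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, f_rgb, f_depth):
        rgb_steered = f_rgb * self.ca_from_depth(f_depth)    # depth-derived weights steer RGB
        depth_steered = f_depth * self.ca_from_rgb(f_rgb)    # RGB-derived weights steer depth
        fused = rgb_steered + depth_steered
        return self.sa(fused)


if __name__ == "__main__":
    rgb_feat = torch.randn(1, 64, 56, 56)
    depth_feat = torch.randn(1, 64, 56, 56)
    out = CrossModalFusion(64)(rgb_feat, depth_feat)
    print(out.shape)  # torch.Size([1, 64, 56, 56])
```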
