Abstract

Salient object detection has achieved great improvements through the use of Fully Convolutional Networks (FCNs). However, the FCN-based U-shape architecture may dilute high-level semantic information during the up-sampling operations in the top-down pathway, which can weaken salient object localization and produce degraded boundaries. To overcome this limitation, we propose a novel pyramid self-attention module (PSAM) and adopt an independent feature-complementing strategy. In PSAM, self-attention layers are appended to the multi-scale pyramid features to capture richer high-level features and provide the model with larger receptive fields. In addition, a channel-wise attention module is employed to reduce redundant features in the FPN and refine the results. Experimental analysis demonstrates that the proposed PSAM contributes effectively to the whole model, which outperforms state-of-the-art methods on five challenging datasets. Finally, quantitative results show that PSAM generates accurate predictions and integral saliency maps, which can benefit other computer vision tasks such as object detection and semantic segmentation.
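The abstract does not show how the self-attention layers attach to the pyramid features. As a rough illustration only, the following PyTorch sketch shows one common non-local formulation of spatial self-attention that could be applied at each pyramid level; the class name `SelfAttention2d`, the `reduction` factor, and the learned residual weight `gamma` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """Non-local style self-attention over one pyramid level (hypothetical sketch).

    Query/key/value projections are 1x1 convolutions; attention is computed
    across all spatial positions, giving the level a global receptive field.
    """
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight, starts at 0

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.key(x).flatten(2)                     # (b, c', hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw) attention map
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out                    # residual connection
```

In a PSAM-like design, one such block would presumably be instantiated per FPN level, e.g. `nn.ModuleList([SelfAttention2d(256) for _ in range(4)])`, so that every scale gains an enlarged receptive field before top-down fusion.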

Highlights

  • Salient object detection or segmentation aims at identifying visually distinctive parts of a natural scene

  • We propose a novel pyramid self-attention module (PSAM) to overcome the limitation of feature dilution in previous Fully Convolutional Network (FCN)-based approaches

  • Experimental results show that the proposed PSAM improves salient object detection performance and achieves results superior to the state of the art on five challenging datasets


Summary

Introduction

Salient object detection or segmentation aims at identifying visually distinctive parts of a natural scene. Fully Convolutional Networks (FCNs) have recently become the fundamental framework for these problems [14,15,16], as FCNs require fewer parameters and accept flexible input sizes compared to fully connected layers. Although these works have achieved great improvements in performance, they are still restricted by some limitations. FCN-based models extract high-level semantic features that facilitate the localization of objects, but important information might be lost during the pooling operations. We propose a novel pyramid self-attention module (PSAM) to overcome the limitation of feature dilution in previous FCN-based approaches. By incorporating a self-attention module with the multi-scale feature maps of an FPN, the model focuses on the high-level features.

[Figure: compared model variants, including PSAM: an FPN structure plus the proposed pyramid self-attention module; and (e) ours: the final model, which contains both the PSAM and channel-wise attention modules.]
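The channel-wise attention module paired with PSAM above is described only at a high level. The sketch below is a minimal, hedged illustration of a standard squeeze-and-excitation style channel attention that matches that description; the class name, the `reduction` ratio, and the bottleneck MLP layout are assumptions for illustration, not the paper's exact design.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (hypothetical sketch).

    Global average pooling summarizes each channel, a small bottleneck MLP
    produces per-channel weights in [0, 1], and the input features are
    rescaled so redundant channels are suppressed.
    """
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # per-channel weights
        return x * w  # rescale each FPN channel
```

Applied after FPN fusion, such a module would reweight channels before prediction, which is one plausible way to realize the "reduce redundant features of the FPN" step described in the abstract.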

Salient Object Detection
Attention Mechanism
The Proposed Method
Pyramid Self-Attention Module
Channel-Wise Attention
Datasets and Evaluation Metrics
Implementation Details
Comparisons with State-of-the-Arts
Quantitative Comparisons
Qualitative Comparisons
Ablation Study
Findings
Discussion and Conclusions