Abstract

Recent salient object detection methods are mainly based on Convolutional Neural Networks (CNNs), and most adopt a U-shaped architecture to extract and fuse multi-scale features. Coarser-level semantic information is transmitted progressively to finer-level layers through repeated upsampling, so the coarsest-level features are diluted and the salient object boundary becomes blurred. Hand-crafted features, on the other hand, are purposive and easy to compute; among them, the edge density feature carries rich edge information and may help sharpen the salient object boundary. In this paper, we propose a Co-Guided Attention Network (CoGANet). Built on the Feature Pyramid Network (FPN), our model implements a co-guided attention mechanism between the image and its edge density feature. In the bottom-up pathway of the FPN, two streams work separately, one taking the original image and the other its edge density feature as input, and each produces five feature maps. The last feature map in each stream then generates a set of attention maps through a Multi-scale Spatial Attention Module (MSAM). In the top-down pathway, the attention maps of one stream are delivered directly to each stage of the other stream, where an Attention-based Feature Fusion Module (AFFM) fuses them with the feature maps. Finally, an accurate saliency map is produced by fusing the finest-level outputs of the two streams. Experimental results on five benchmark datasets demonstrate that our model is superior to 13 state-of-the-art methods in terms of four evaluation metrics.
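
The abstract does not define how the edge density feature is computed. As a rough illustration only, the sketch below assumes one common reading: the fraction of edge pixels inside a local window around each pixel. The function name `edge_density`, the gradient-threshold edge detector, and the `threshold` and `window` parameters are all hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np

def edge_density(gray, threshold=0.2, window=5):
    """Fraction of edge pixels in a window x window neighbourhood of each pixel.

    Assumption: 'edge density' means local edge-pixel frequency; the paper's
    actual definition may differ.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    gray = np.asarray(gray, dtype=np.float64)

    # Gradient magnitude from finite differences (a stand-in for any edge detector).
    gy, gx = np.gradient(gray)
    edges = (np.hypot(gx, gy) > threshold).astype(np.float64)

    # Box-filter sum via an integral image, with zero padding at the borders.
    pad = window // 2
    padded = np.pad(edges, pad, mode="constant")
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = gray.shape
    total = (integral[window:window + h, window:window + w]
             - integral[:h, window:window + w]
             - integral[window:window + h, :w]
             + integral[:h, :w])
    return total / (window * window)
```

Under these assumptions the result has the same height and width as the input and values in [0, 1], so it could be fed to a second stream as a single-channel image alongside the original.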
