Abstract

Convolutional neural networks have been widely applied in saliency detection task because of its powerful feature extraction capability. Most of existing saliency detection models have achieved great progress by aggregating the strong multi-level features. However, it is still a challenging task to design the feature fusing strategy because of the various differences between multi-level features. In this paper, we explore the effect of cascaded pooling operations for saliency detection and propose a novel network to decode saliency cues from multi-level features progressively. We refer to the architecture as “cascaded hourglass” feature fusing network. The proposed network equips with three cascaded sub-modules to capture the multi-scale context and integrate multi-level features progressively. Specifically, we first propose a multi-scale context-aware feature extraction block with different dilated convolutional branches to obtain multi-scale context-aware saliency cues. Then, a hourglass feature fusing block with successive steps of pooling operations is applied to convert the features to multiple feature spaces. Furthermore, we stack a serial of the hourglass feature fusing blocks to purify the multi-level coarse features progressively. Finally, we combine the selective features with cascaded feature decoder to produce final saliency map. Extensive experiments demonstrate the proposed network compares favorably against state-of-the-art methods. Additionally, our model is efficient with the real-time speed of 28 FPS when processing a 400×300 image.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call