Abstract

Salient Object Detection aims to detect the most visually distinctive objects in an image. We address this problem by introducing average pooling to explore multi-level deep convolution features that complement the information captured by max pooling. Based on the U-Net structure, we propose an Average- and Max-Pool Network (AMPNet) that leverages average- and max-pool modules to integrate multi-level complementary contextual features in the spatial and channel-wise dimensions, respectively. The complementary contextual features generated by our network improve the completeness of the detected objects. We observe that non-salient regions are often misrecognized as salient objects because of redundant information contained in the multi-level convolution features. To address this problem, we introduce two top-down feedback paths based on the above two modules, fully exploiting their top-level semantic guidance to improve the accuracy of salient object detection. Finally, we apply a Feature Fusion Module and a Deep Supervision Mechanism to further improve the performance of the network across different datasets. Experimental results on six benchmark datasets show that our network is on par with state-of-the-art approaches. Our method runs at more than 45 FPS (with a VGG backbone) and 35 FPS (with a ResNet backbone) on a single GPU, meeting real-time requirements.
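The complementarity of the two pooling operations can be illustrated with a minimal sketch: average pooling summarizes region-level context, while max pooling keeps the most distinctive activations. The function names and the elementwise fusion below are illustrative assumptions, not AMPNet's actual modules.

```python
# Sketch of complementary average/max pooling on a toy 2D feature map.
# This is illustrative only; AMPNet's real modules operate on multi-level
# deep features in spatial and channel-wise dimensions.

def pool2x2(feat, reduce_fn):
    """Apply 2x2 pooling with stride 2 using the given reduction function."""
    h, w = len(feat), len(feat[0])
    out = []
    for i in range(0, h - 1, 2):
        row = []
        for j in range(0, w - 1, 2):
            window = [feat[i][j], feat[i][j + 1],
                      feat[i + 1][j], feat[i + 1][j + 1]]
            row.append(reduce_fn(window))
        out.append(row)
    return out

def avg(vals):
    return sum(vals) / len(vals)

feat = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 2.0, 4.0, 1.0],
    [0.0, 1.0, 2.0, 2.0],
    [3.0, 3.0, 6.0, 0.0],
]

avg_pooled = pool2x2(feat, avg)  # smooth, region-level context
max_pooled = pool2x2(feat, max)  # sharp, most-distinctive responses

# Hypothetical fusion of the two complementary feature maps
# (elementwise sum chosen purely for illustration).
fused = [[a + m for a, m in zip(ar, mr)]
         for ar, mr in zip(avg_pooled, max_pooled)]
```

Combining both statistics preserves fine peaks that average pooling alone would blur, while retaining the contextual smoothing that max pooling alone discards.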
