Abstract

Deep convolutional neural networks (CNNs) have been widely applied to salient object detection (SOD) with promising performance. However, the repeated pooling and strided convolution operations in CNNs considerably reduce the feature resolution relative to the original image, losing spatial details and fine structures, especially along salient object boundaries. To address this issue, we present a recurrent reverse attention based residual learning network for SOD. We first construct a pair of low- and high-level integrated feature representations by aggregating two groups of low- and high-level feature maps, respectively, for which we design a novel joint feature pyramid pooling module to increase the resolution of the high-level features. We then progressively learn to refine the residual between each side-output saliency prediction and the ground truth in a cascaded fashion, alternately using the high-level and low-level integrated features. Each residual learning module contains a recurrent reverse attention module that focuses on the residual region outside the currently predicted salient regions, guiding the network to recover the missing salient parts and details. Finally, we develop a simple yet effective boundary detection loss to compensate for the loss of fine boundary details. Extensive evaluations on six popular SOD benchmark datasets demonstrate remarkable performance gains of our proposed approach over state-of-the-art methods. Notably, our approach runs in real time at 32 fps.
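The abstract does not include implementation details, so the following is a minimal PyTorch sketch of what one reverse-attention residual refinement step could look like, assuming sigmoid-normalized single-channel side outputs. The module name `ReverseAttentionResidual`, the branch structure, and parameters such as `mid_channels` are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReverseAttentionResidual(nn.Module):
    """Hypothetical sketch of one reverse-attention residual refinement step.

    Given an integrated feature map and the saliency prediction from the
    previous stage, the reverse attention weight 1 - sigmoid(prediction)
    highlights regions *outside* the current salient estimate, so the
    convolutional branch learns the residual (missed salient parts and
    details) that is added back onto the previous prediction.
    """

    def __init__(self, in_channels, mid_channels=64):
        super().__init__()
        self.residual_branch = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, 1, 1),  # single-channel residual map
        )

    def forward(self, feature, prev_pred):
        # Match the previous prediction's spatial size to the feature map.
        prev_pred = F.interpolate(
            prev_pred, size=feature.shape[2:],
            mode="bilinear", align_corners=False,
        )
        # Reverse attention: emphasize regions not yet predicted as salient.
        reverse_att = 1.0 - torch.sigmoid(prev_pred)
        residual = self.residual_branch(feature * reverse_att)
        # Refined side output = previous prediction + learned residual.
        return prev_pred + residual
```

In the cascaded scheme the abstract describes, several such modules would be chained, each consuming the previous stage's side output while alternating between the high-level and low-level integrated features, with a supervision loss applied to every refined side output.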
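The exact form of the boundary detection loss is likewise not specified here; below is a hedged sketch of one common way to realize such a loss, using a morphological gradient (max-pooling-based dilation minus erosion) to extract boundary maps from the prediction and the ground truth before comparing them. The function name, kernel size, and use of binary cross-entropy are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def boundary_detection_loss(pred_logits, target, kernel_size=3):
    """Hypothetical boundary loss sketch (assumed form, not the paper's exact loss).

    Boundaries are approximated with a morphological gradient: dilation
    (max pooling) minus erosion (negated max pooling of the negation).
    Binary cross-entropy between the predicted and ground-truth boundary
    maps then penalizes blurry or misplaced object contours.
    """
    pad = kernel_size // 2

    def morph_gradient(x):
        dilated = F.max_pool2d(x, kernel_size, stride=1, padding=pad)
        eroded = -F.max_pool2d(-x, kernel_size, stride=1, padding=pad)
        return dilated - eroded

    pred_boundary = morph_gradient(torch.sigmoid(pred_logits)).clamp(0.0, 1.0)
    target_boundary = morph_gradient(target).clamp(0.0, 1.0)
    return F.binary_cross_entropy(pred_boundary, target_boundary)
```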
