Abstract

Deep learning has driven substantial progress in Salient Object Detection (SOD). However, most existing methods either apply the same strategy to extract salient cues from every feature level, ignoring the differences between levels at the feature extraction stage, or suffer from accumulated noise and diluted spatial details at the feature fusion stage, or both. These two problems hinder further performance improvement. In this paper, we propose an effective SOD model, PiNet, which addresses them through two novel mechanisms: level-specific feature extraction and progressive refinement of saliency. We design customized feature extraction components for each feature level, which lets us extract better saliency cues from multi-level features. Saliency feature refinement in the branches follows a coarse-to-fine process in which the refined features progressively gain more location cues and more internal and boundary details. Through short connections, the saliency cues extracted in different branches are selectively transmitted and integrated, which mitigates both the accumulation of noisy information and the dilution of detailed information. Using four different backbones, we verify that our model adapts well and makes accurate saliency predictions with different pretrained models. Extensive experiments on five public datasets demonstrate that PiNet outperforms 19 state-of-the-art (SOTA) methods in SOD while keeping a small model size (56.1 MB) and a fast inference speed (47 FPS).
