Abstract

Deep salient object detection (SOD) methods usually use an end-to-end network to extract global or local image information such as contrast, spatial distribution, and objectness. We find that depth is an important saliency cue that previous works have neglected. For a single image, depth denotes the relative distance of objects from the observer, and an object that is relatively closer usually attracts more human attention and has higher saliency. In this paper, we propose a deep convolutional network that extracts depth information from the image to predict saliency. Our network consists of two streams: a depth stream and a contrast stream. The depth stream predicts the saliency induced by object depth using two deep networks. The contrast stream extracts the contrast information of the image using a multi-scale network. The saliency prediction from the depth stream often has blurred boundaries, while the result of the contrast stream is more accurate at the pixel level. We therefore obtain the final saliency map by combining the results of the two streams. We compare our method with state-of-the-art deep SOD methods on four public datasets. The experimental results show that combining the two streams yields more accurate results.
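
The abstract does not state how the two stream outputs are combined. As a rough illustration only, the sketch below assumes a pixel-wise weighted average of two min-max normalized saliency maps. The function names `normalize` and `fuse_saliency` and the weight `alpha` are hypothetical and are not taken from the paper.

```python
def normalize(saliency_map):
    """Min-max scale a 2D saliency map (list of rows) into [0, 1]."""
    lo = min(min(row) for row in saliency_map)
    hi = max(max(row) for row in saliency_map)
    if hi == lo:
        # A constant map carries no saliency signal; return all zeros.
        return [[0.0 for _ in row] for row in saliency_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency_map]


def fuse_saliency(depth_map, contrast_map, alpha=0.5):
    """Blend two same-sized saliency maps pixel by pixel.

    Assumed fusion rule (not from the paper):
        fused = alpha * depth + (1 - alpha) * contrast
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    same_shape = len(depth_map) == len(contrast_map) and all(
        len(d_row) == len(c_row) for d_row, c_row in zip(depth_map, contrast_map)
    )
    if not same_shape:
        raise ValueError("both maps must have the same shape")
    return [
        [alpha * d + (1.0 - alpha) * c for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(depth_map, contrast_map)
    ]


# Toy 2x2 maps standing in for the outputs of the two streams.
depth_saliency = normalize([[0.0, 2.0], [1.0, 4.0]])
contrast_saliency = normalize([[4.0, 0.0], [2.0, 4.0]])
fused = fuse_saliency(depth_saliency, contrast_saliency, alpha=0.5)
```

Under this assumed rule, a larger `alpha` gives more weight to the depth stream, and a smaller one favors the sharper boundaries of the contrast stream.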
