Abstract

End-to-end training of a deep CNN-based model for salient object detection usually requires a huge number of training samples with pixel-level annotations, which are costly and time-consuming to obtain. In this paper, we propose an approach that can utilize large amounts of web data for learning a deep salient object detection model. With thousands of images collected from the Web, we first employ several bottom-up saliency detection techniques to generate salient object masks for all images, and then use a novel quality evaluation method to pick out a subset of images with reliable masks for training. We then develop a self-training approach to boost the performance of our initial network, which iterates between network training and training set updating. Importantly, unlike existing webly-supervised or weakly-supervised methods, our approach automatically selects reliable images for network training without requiring any human intervention (e.g., dividing images into different difficulty levels). Extensive experiments on several widely used benchmarks demonstrate that our method achieves state-of-the-art performance. It significantly outperforms existing unsupervised and weakly-supervised salient object detection methods, and achieves competitive or even better performance than fully supervised approaches.
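
The alternation between network training and training set updating can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the paper's implementation: the abstract does not specify the quality measure, the selection rule, or the stopping criterion, so `quality`, `threshold`, `rounds`, `train`, and `predict` are all hypothetical placeholders supplied by the caller.

```python
from typing import Any, Callable, List, Sequence, Tuple

Pair = Tuple[Any, Any]  # (image, mask)


def select_reliable(
    images: Sequence[Any],
    masks: Sequence[Any],
    quality: Callable[[Any, Any], float],
    threshold: float,
) -> List[Pair]:
    """Keep only (image, mask) pairs whose quality score reaches the threshold.

    `quality` stands in for the paper's quality evaluation method, whose
    details are not given in the abstract (assumption).
    """
    return [
        (img, mask)
        for img, mask in zip(images, masks)
        if quality(img, mask) >= threshold
    ]


def self_train(
    images: Sequence[Any],
    initial_masks: Sequence[Any],
    train: Callable[[List[Pair]], Any],
    predict: Callable[[Any, Any], Any],
    quality: Callable[[Any, Any], float],
    threshold: float,
    rounds: int,
) -> Tuple[Any, List[Pair]]:
    """Alternate between training a model and rebuilding its training set.

    `initial_masks` play the role of the masks produced by bottom-up
    saliency methods; a fixed number of rounds is an assumed stopping rule.
    """
    # Initial training set: images whose bottom-up masks look reliable.
    train_set = select_reliable(images, initial_masks, quality, threshold)
    model = train(train_set)

    for _ in range(rounds):
        # Training set update: re-predict masks with the current model,
        # then re-select the reliable ones.
        new_masks = [predict(model, img) for img in images]
        candidate_set = select_reliable(images, new_masks, quality, threshold)
        if not candidate_set:
            break  # nothing reliable to train on; keep the previous model
        train_set = candidate_set
        # Network training on the updated set.
        model = train(train_set)

    return model, train_set
```

In this sketch no human labeling or difficulty grouping enters the loop: which images are used for training is decided entirely by the `quality` score, which mirrors the automatic selection the abstract describes.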
