Recent deep learning-based methods for saliency detection have demonstrated the effectiveness of integrating features at different scales. They usually rely on complex network architectures, e.g., multiple networks, to explore the multi-scale information of images, which is expensive in computation and memory. Feature maps produced by different subsampling convolutional layers have different spatial resolutions; therefore, they can serve as multi-scale features at reduced cost. In this paper, by exploiting the in-network feature hierarchy of convolutional networks, we propose a novel multi-scale network for saliency detection (MSNSD) consisting of three modules: bottom-up feature extraction, top-down feature connection, and multi-scale saliency prediction. Moreover, to further boost the performance of MSNSD, we propose an input image-aware saliency aggregation method based on ridge regression, which combines MSNSD with several well-performing handcrafted shallow models. Extensive experiments on several benchmarks show that the proposed MSNSD outperforms state-of-the-art saliency methods with lower computational and memory cost. Meanwhile, our aggregation method combines deep and shallow models effectively and efficiently, making them complementary to each other.
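As a rough illustration of the ridge-regression aggregation idea, the following minimal sketch fits one weight per saliency model in closed form and fuses the maps as a weighted sum. It is not the authors' implementation: the function names `fit_ridge_weights` and `aggregate`, the regularization parameter `lam`, and the pixel-wise formulation are assumptions made for illustration. The abstract does not specify how the weights are conditioned on the input image, so the sketch simply fits on whatever reference maps it is given.

```python
import numpy as np


def fit_ridge_weights(model_maps, ground_truth, lam=1.0):
    """Fit one fusion weight per model by ridge regression (hypothetical helper).

    model_maps:   array of shape (K, H, W), one saliency map per model.
    ground_truth: array of shape (H, W), the reference saliency mask.
    lam:          ridge penalty (assumed hyperparameter, not from the paper).
    """
    k = model_maps.shape[0]
    x = model_maps.reshape(k, -1).T   # (N, K): one row per pixel
    y = ground_truth.reshape(-1)      # (N,)
    # Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y
    a = x.T @ x + lam * np.eye(k)
    b = x.T @ y
    return np.linalg.solve(a, b)


def aggregate(model_maps, weights):
    """Fuse K saliency maps into one map using the learned weights."""
    fused = np.tensordot(weights, model_maps, axes=1)  # (H, W)
    return np.clip(fused, 0.0, 1.0)                    # keep values in [0, 1]
```

In use, `model_maps` would stack the MSNSD output with the outputs of the shallow models for a reference image, and the resulting weights would then be passed to `aggregate` together with the maps of the image to be predicted.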