Abstract

In recent years, fully convolutional neural networks (FCNs) have delivered significant success in saliency detection. Although the different levels of FCN layers hold different types of information for salient object detection, finding a generic method that integrates all relevant information through multi-level aggregation remains challenging. In this paper, we present a novel multi-level aggregation method that follows a U-shaped architecture based on the VGG-16 network. The shallower layers of FCNs contain low-level features that capture the finer details of salient objects, while the deeper layers hold high-level features that carry more contextual information. To exploit all the relevant information, we extend the last four side-outputs of U-Net on the encoder and decoder sides and then use skip and short connections to combine the high-level contextual knowledge with the low-level details. In addition, we integrate recurrent convolutional layers (RCLs) into our model, which add depth and enhance its capability to integrate contextual knowledge. Finally, we combine all the side-outputs into a final saliency map for salient object detection. We evaluate the proposed model on six widely used saliency detection benchmarks, comparing it with 11 state-of-the-art approaches. Experimental results show that our method achieves favorable performance on all compared evaluation measures.
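
The final fusion step, in which side-outputs from different depths are merged into one saliency map, can be illustrated with a minimal sketch. This is not the implementation from the paper: nearest-neighbour upsampling, uniform fusion weights, and the function names `upsample_nearest` and `fuse_side_outputs` are all assumptions made here for illustration, and the skip connections, short connections, and RCLs are omitted.

```python
import math


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse_side_outputs(side_logits, weights=None):
    """Fuse side-output logit maps of different resolutions into one map.

    side_logits: square 2-D lists ordered from shallow (largest) to deep
                 (smallest); each size must divide the largest size.
    weights:     hypothetical per-level fusion weights (assumption);
                 uniform averaging is used when omitted.
    """
    target = len(side_logits[0])
    if weights is None:
        weights = [1.0 / len(side_logits)] * len(side_logits)

    fused = [[0.0] * target for _ in range(target)]
    for logits, w in zip(side_logits, weights):
        factor = target // len(logits)
        up = upsample_nearest(logits, factor) if factor > 1 else logits
        for i in range(target):
            for j in range(target):
                fused[i][j] += w * up[i][j]

    # A sigmoid turns the fused logits into per-pixel saliency probabilities.
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in fused]


# Toy example: a 2x2 shallow (detail) map and a 1x1 deep (context) map.
shallow = [[0.0, 0.0], [0.0, 0.0]]
deep = [[2.0]]
saliency = fuse_side_outputs([shallow, deep])
```

In this toy case the deep map raises every pixel of the fused output equally, while a non-uniform shallow map would add pixel-level variation on top of it. In a trained network the fusion weights would typically be learned rather than fixed.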
