Abstract

In recent years, Convolutional Neural Networks (CNNs) have been widely used in visual tasks because of their powerful feature extraction ability, and CNN-based salient object detection methods have achieved strong performance. Although a CNN yields a large amount of feature information, the key to improving the quality of saliency maps is making full use of the high-level and low-level features and the relationships between them. Some previous works merged high-level and low-level features without first processing them, which blurred the saliency map and, in complex environments, could even make the foreground indistinguishable from the background. To address this problem, we propose an Attention guided Contextual Feature Fusion Network (ACFFNet) for salient object detection. The proposed ACFFNet has three main modules: the Multi-field Channel Attention (MCA) module, the Contextual Feature Fusion (CFF) module, and the feature Self-Refinement (SR) module. The MCA module selects features from different receptive fields, the CFF module efficiently aggregates contextual features, and the SR module repairs holes in the prediction maps caused by contradictory responses from different layers. In addition, we propose a Cross-Consistency Enhancement (CCE) loss that guides the network to focus on more detailed information and to highlight the difference between foreground and background. Experimental results on six benchmark datasets show that the proposed method outperforms state-of-the-art methods.
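
The abstract does not specify how the MCA module is built. As a rough illustration only of what "selecting features from different receptive fields" with channel attention can look like, the sketch below weights several same-shaped feature maps (one per receptive field) channel by channel, using a softmax over globally pooled descriptors. The function names, the pooling choice, and the softmax gating are all assumptions made for this example; they are not taken from ACFFNet.

```python
import math


def global_avg_pool(feature):
    """Mean of each channel; `feature` is a list of channels, each an H x W list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_across_branches(branches):
    """Fuse same-shaped C x H x W feature maps, one per receptive field.

    Hypothetical gating (an assumption, not the paper's design): for each
    channel, a softmax over the branches' pooled descriptors decides how much
    each receptive field contributes to the fused channel.
    """
    descriptors = [global_avg_pool(b) for b in branches]  # B x C
    num_branches = len(branches)
    num_channels = len(descriptors[0])
    # weights[c][b]: contribution of branch b to channel c
    weights = [softmax([d[c] for d in descriptors]) for c in range(num_channels)]
    fused = []
    for c in range(num_channels):
        height = len(branches[0][c])
        width = len(branches[0][c][0])
        fused.append([
            [sum(weights[c][b] * branches[b][c][i][j] for b in range(num_branches))
             for j in range(width)]
            for i in range(height)
        ])
    return fused, weights
```

In this toy version the branch with the stronger average response in a channel dominates that channel of the fused map; a trained module would normally learn the gating from data rather than read it directly off the pooled values.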
