Abstract

Salient object detection has recently been revolutionised by convolutional neural networks (CNNs). However, state-of-the-art still-image saliency detectors are difficult to transfer to videos directly, because they neglect the temporal context between frames. In this study, the authors propose a flow-driven attention network (FDAN) that exploits motion information for video salient object detection. FDAN consists of an appearance feature extractor, a motion-guided attention module and a saliency map regression module, which extract appearance features per frame, refine them with optical flow and infer the final saliency map, respectively. The motion-guided attention module is the core of FDAN: it captures motion information in the form of attention, implemented as a two-branch CNN that fuses optical-flow and appearance features. In addition, a shortcut connection is applied to the attention-multiplied feature map to suppress noise. Experimental results show that the proposed method achieves performance on par with the state-of-the-art flow-guided recurrent neural encoder on the challenging Densely Annotated Video Segmentation (DAVIS) and Freiburg-Berkeley Motion Segmentation (FBMS) benchmarks, while being twice as fast at detection.
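
The abstract does not give implementation details, but the attention mechanism it describes can be sketched roughly as follows. This is a minimal, hypothetical PyTorch sketch: the class name `MotionGuidedAttention`, the layer sizes, the single-channel attention map and the parameter names are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn


class MotionGuidedAttention(nn.Module):
    """Hypothetical sketch of a motion-guided attention module.

    One branch encodes the optical-flow field into a spatial attention
    map, the map reweights the appearance features, and a shortcut
    connection adds the original features back to suppress noise from
    imperfect flow estimates. All layer sizes are assumptions.
    """

    def __init__(self, flow_channels: int = 2):
        super().__init__()
        self.flow_branch = nn.Sequential(
            nn.Conv2d(flow_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),  # attention values in [0, 1]
        )

    def forward(self, appearance: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        # appearance: (B, C, H, W) per-frame appearance features
        # flow:       (B, 2, H, W) optical flow between adjacent frames
        attention = self.flow_branch(flow)   # (B, 1, H, W) spatial attention
        attended = appearance * attention    # motion-weighted features
        return appearance + attended         # shortcut connection


# Usage with dummy tensors:
app = torch.randn(1, 256, 56, 56)
flow = torch.randn(1, 2, 56, 56)
refined = MotionGuidedAttention()(app, flow)  # same shape as `app`
```

The shortcut (residual) connection is the key design choice the abstract highlights: because the refined features reduce to the original appearance features when the attention map is uninformative, errors in the estimated optical flow cannot fully suppress valid appearance evidence.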
