Optical flow networks have been widely used for video saliency detection (VSD) because they effectively capture object motion. However, optical flow blurs the edges of salient objects, leading to poorly defined object boundaries. To address this issue, we propose an optical flow-based edge-weighted loss function for training a network, called Flow-Edge-Net, that balances the weights of foreground and background information at the edges of video frames and thereby detects salient boundaries more accurately. Specifically, we propose two complementary encoder-decoder networks based on the concept of decoupling: the optical flow network focuses on moving objects, while the edge network focuses on edge information. Because the two networks take the same input and output features of the same dimension, our adaptive weighted feature fusion module can compare and integrate the edge information and location information from the two networks through adaptive weighting. The proposed method has been evaluated on five widely used databases. Experimental results show that Flow-Edge-Net locates salient objects with accurate, refined edges and outperforms state-of-the-art methods for detecting salient objects in videos.
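The abstract does not specify how the adaptive weighted feature fusion module is implemented; as one plausible reading, the module could learn scalar gates for the two streams and normalize them before mixing. The sketch below illustrates that idea only; the function names, the scalar gating, and the softmax normalization are all assumptions, not the paper's published code.

```python
# Hypothetical sketch of adaptive weighted feature fusion (not the authors'
# implementation): two same-dimension feature vectors are mixed using
# softmax-normalized scalar gates standing in for learned weights.
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalar gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(flow_feat, edge_feat, gate_flow, gate_edge):
    """Fuse the optical-flow and edge feature vectors element-wise.

    gate_flow / gate_edge are assumed scalar gates (in a real network they
    would be produced by a small sub-network from the pooled features).
    """
    w_flow, w_edge = softmax([gate_flow, gate_edge])
    return [w_flow * f + w_edge * e for f, e in zip(flow_feat, edge_feat)]

# Equal gates reduce the fusion to a plain average of the two streams.
fused = fuse([1.0, 2.0], [3.0, 4.0], 0.0, 0.0)  # -> [2.0, 3.0]
```

A gate biased toward the edge stream would shift the fused features toward the edge network's output, which matches the stated goal of letting the two decoupled branches contribute adaptively rather than equally.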