Abstract

Unsupervised video object segmentation without any object annotation or prior knowledge is highly challenging. In this article, we formulate a completely unsupervised video object segmentation network, called Pop-Net, which pops out the most salient object in an input video through self-growth. Specifically, we introduce a novel self-growth strategy that helps a base segmentation network gradually grow to highlight the salient object as the video progresses. To solve the sample generation problem for the unsupervised setting, we propose a sample generation module that fuses appearance and motion saliency. Furthermore, the proposed sample optimization module improves the samples by applying contour constraints at each self-growth step. Experimental results on several datasets (DAVIS, DAVSOD, VideoSD, SegTrack-v2) show the effectiveness of the proposed method. In particular, the proposed method achieves state-of-the-art performance on completely unfamiliar datasets (i.e., without fine-tuning on them).
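The sample generation idea, fusing an appearance saliency map with a motion saliency map to produce a binary pseudo-label, can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the function name `fuse_saliency`, the linear weighting `alpha`, and the fixed `threshold` are all assumptions for demonstration.

```python
import numpy as np

def fuse_saliency(appearance, motion, alpha=0.5, threshold=0.5):
    """Fuse appearance and motion saliency maps into a binary pseudo-label.

    Both maps are assumed to be normalized to [0, 1]. The weighted
    average and hard threshold are illustrative choices; the actual
    fusion in the paper may differ.
    """
    fused = alpha * appearance + (1.0 - alpha) * motion
    return (fused >= threshold).astype(np.uint8)

# Toy 2x2 saliency maps: a salient, moving object in the top-left pixel.
appearance = np.array([[0.9, 0.2],
                       [0.1, 0.0]])
motion = np.array([[0.8, 0.4],
                   [0.2, 0.1]])
label = fuse_saliency(appearance, motion)
```

In practice such pseudo-labels would then be refined (e.g., by the contour-constrained optimization step the abstract mentions) before being used to train the segmentation network.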
