Abstract

Video object segmentation is a mainstream branch of current image processing research. A key open problem is moving deep learning methods from fully supervised toward unsupervised settings. Along this path, One-Shot Video Object Segmentation (OSVOS) successfully tackles the semi-supervised video object segmentation task: it transfers general semantic information learned on ImageNet to the foreground segmentation task, and then learns the appearance of a single annotated object in the sequence. In this paper, building on the concept of OSVOS, an improved neural network architecture with dilated convolution, multi-scale convolution fusion, and skip layers is proposed. Dilated convolution enlarges the receptive field; multi-scale convolution produces feature maps at several scales and fuses them; skip layers pass low-level feature information to upper layers. Together, these changes improve the final accuracy, and the experimental results show that all evaluation metrics improve.
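The three architectural ideas named above can be illustrated with a minimal NumPy sketch (not the paper's actual network; kernel weights, dilation rates, and the sum-based fusion rule are illustrative assumptions): a dilated convolution whose effective span grows with the dilation rate, multi-scale feature maps fused element-wise, and a skip connection that adds low-level input back to the fused features.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Valid-mode 1-D convolution with a dilation (atrous) rate.

    The effective kernel span is (len(w) - 1) * dilation + 1, so a
    larger dilation enlarges the receptive field without extra weights.
    """
    k = len(w)
    span = (k - 1) * dilation + 1
    out_len = len(x) - span + 1
    return np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ])

x = np.arange(16, dtype=float)       # toy 1-D "feature map"
w = np.array([1.0, 1.0, 1.0])        # illustrative shared kernel

# Multi-scale feature maps: same kernel applied at dilation rates 1, 2, 4.
scales = {d: dilated_conv1d(x, w, d) for d in (1, 2, 4)}

# Fuse the multi-scale maps: crop to the shortest map, then sum element-wise.
min_len = min(len(v) for v in scales.values())
fused = sum(v[:min_len] for v in scales.values())

# Skip connection: add the (cropped) low-level input back onto the fusion.
out = fused + x[:min_len]
```

In a real segmentation network these operations would be 2-D and learned, but the sketch shows the mechanics: higher dilation shrinks the valid output (larger receptive field), fusion combines scales, and the skip path re-injects low-level detail.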
