Abstract

Video object segmentation is a mainstream branch of current image processing research. A key open problem is moving deep learning methods from fully supervised toward unsupervised settings. Along this path, One-Shot Video Object Segmentation (OSVOS) successfully tackles the semi-supervised video object segmentation task: it transfers general semantic information learned on ImageNet to the foreground segmentation task, and then learns the appearance of a single annotated object in the sequence. In this paper, building on the concept of OSVOS, an improved neural network architecture with dilated convolution, multi-scale convolution fusion, and skip layers is proposed. Dilated convolution enlarges the receptive field; multi-scale convolution produces feature maps at several scales and fuses them; skip layers pass low-level feature information to upper layers. Together, these changes improve the final accuracy, and the experimental results show that all evaluation metrics improve.
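The three architectural ideas named above can be illustrated with a minimal NumPy sketch (not the paper's actual network; kernel weights, dilation rates, and the sum-based fusion rule are illustrative assumptions): a dilated convolution whose effective span grows with the dilation rate, multi-scale feature maps fused element-wise, and a skip connection that adds low-level input back to the fused features.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Valid-mode 1-D convolution with a dilation (atrous) rate.

    The effective kernel span is (len(w) - 1) * dilation + 1, so a
    larger dilation enlarges the receptive field without extra weights.
    """
    k = len(w)
    span = (k - 1) * dilation + 1
    out_len = len(x) - span + 1
    return np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ])

x = np.arange(16, dtype=float)       # toy 1-D "feature map"
w = np.array([1.0, 1.0, 1.0])        # illustrative shared kernel

# Multi-scale feature maps: same kernel applied at dilation rates 1, 2, 4.
scales = {d: dilated_conv1d(x, w, d) for d in (1, 2, 4)}

# Fuse the multi-scale maps: crop to the shortest map, then sum element-wise.
min_len = min(len(v) for v in scales.values())
fused = sum(v[:min_len] for v in scales.values())

# Skip connection: add the (cropped) low-level input back onto the fusion.
out = fused + x[:min_len]
```

In a real segmentation network these operations would be 2-D and learned, but the sketch shows the mechanics: higher dilation shrinks the valid output (larger receptive field), fusion combines scales, and the skip path re-injects low-level detail.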
