Abstract

The goal of video inpainting is to fill in the missing regions of a given video sequence. Because of the additional temporal dimension, generating a plausible result is considerably more challenging in video inpainting than in image inpainting. In this paper, we propose a novel video inpainting network based on deformable alignment, named the Deformable Alignment Network (DANet). Given several consecutive frames, DANet aligns image features from the global level down to the pixel level in a coarse-to-fine fashion. After alignment, DANet applies a fusion block that fuses the aligned features of neighboring frames and generates the inpainted frame. The coarse-to-fine alignment architecture yields a better fusion result and, combined with the fusion block, ensures temporal and spatial consistency. Experimental results demonstrate that DANet produces more semantically correct and temporally coherent results, and is comparable with state-of-the-art video inpainting methods.
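To illustrate the pixel-level stage of the alignment described above, the sketch below implements offset-based feature warping with bilinear sampling, the core operation underlying deformable alignment. This is a minimal NumPy illustration under our own assumptions (dense per-pixel offsets, single feature map), not the authors' DANet implementation; the function name `deform_sample` and the `(dy, dx)` offset layout are hypothetical choices for this example.

```python
import numpy as np

def deform_sample(feat, offsets):
    """Warp a feature map by per-pixel offsets using bilinear sampling.

    feat:    (H, W, C) features from a neighboring frame
    offsets: (H, W, 2) learned (dy, dx) displacement for each output pixel
    Returns an aligned (H, W, C) feature map.
    """
    H, W, C = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling coordinates, clipped to stay inside the feature map.
    sy = np.clip(ys + offsets[..., 0], 0, H - 1)
    sx = np.clip(xs + offsets[..., 1], 0, W - 1)
    y0 = np.floor(sy).astype(int)
    x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = (sy - y0)[..., None]
    wx = (sx - x0)[..., None]
    # Bilinear blend of the four neighboring feature vectors.
    return ((1 - wy) * (1 - wx) * feat[y0, x0]
            + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0]
            + wy * wx * feat[y1, x1])
```

In a deformable alignment module, the offsets would be predicted by a small convolutional network conditioned on the reference and neighboring frames, coarsely at low resolution first and then refined per pixel, which is the coarse-to-fine scheme the abstract describes.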
