Abstract

Video inpainting fills missing regions of video frames with plausible content inferred from the known regions of the video. Although deep learning has achieved strong results in image inpainting, video inpainting remains challenging. Recent work has introduced Transformer modules into video inpainting with good results, but these methods struggle to deliver speed and quality at the same time. To address the speed problem, we propose an improved hybrid encoder network that combines 3D and 2D convolutions. We also adopt the Transformer module from prior work to extract information from reference frames temporally distant from the target frame, which preserves temporal consistency and yields better inpainting results. We evaluate the method qualitatively and quantitatively on test video sequences with both static and moving masks. Compared with recent work, our video inpainting framework further improves inpainting speed while matching inpainting quality.
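The abstract does not include code, but the speed argument rests on the hybrid encoder design: a shallow 3D stage mixes short-range temporal context, after which cheaper per-frame 2D convolutions do the bulk of the spatial work. Below is a minimal PyTorch sketch of that idea under stated assumptions; the class name, channel counts, and layer strides are illustrative choices of ours, not the authors' implementation.

```python
import torch
import torch.nn as nn

class Hybrid3D2DEncoder(nn.Module):
    """Sketch of a 3D/2D hybrid encoder (illustrative, not the paper's code).

    A 3D convolutional stem aggregates short-range temporal context across
    neighboring frames; per-frame 2D convolutions then refine spatial
    features, which is cheaper than a fully 3D encoder.
    """
    def __init__(self, in_ch=4, feat_ch=64):  # 4 channels: RGB + inpainting mask
        super().__init__()
        # 3D stage: mixes information along the temporal axis
        self.stem3d = nn.Sequential(
            nn.Conv3d(in_ch, feat_ch, kernel_size=(3, 5, 5),
                      stride=(1, 2, 2), padding=(1, 2, 2)),
            nn.LeakyReLU(0.2, inplace=True),
        )
        # 2D stage: processes each frame independently
        self.body2d = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch * 2, 3, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(feat_ch * 2, feat_ch * 2, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, frames):
        # frames: (batch, channels, time, height, width)
        x = self.stem3d(frames)                       # (B, F, T, H/2, W/2)
        b, f, t, h, w = x.shape
        # Fold time into the batch axis so 2D convs run per frame
        x = x.permute(0, 2, 1, 3, 4).reshape(b * t, f, h, w)
        x = self.body2d(x)                            # (B*T, 2F, H/4, W/4)
        _, f2, h2, w2 = x.shape
        return x.reshape(b, t, f2, h2, w2)            # per-frame feature maps


# Usage: encode a clip of 5 masked 240x432 frames
enc = Hybrid3D2DEncoder()
clip = torch.randn(1, 4, 5, 240, 432)
feats = enc(clip)
print(feats.shape)  # torch.Size([1, 5, 128, 60, 108])
```

The resulting per-frame feature maps are what a Transformer module, as in the prior work the abstract cites, would attend over to pull in information from reference frames far from the target frame.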
