High-Quality Video Generation from Static Structural Annotations

Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Chen Change Loy

doi:10.1007/s11263-020-01334-x

Abstract

This paper proposes a novel unsupervised video generation method conditioned on a single structural annotation map, which, in contrast to prior conditioned video generation approaches, provides a good balance between motion flexibility and visual quality in the generation process. Unlike end-to-end approaches that model scene appearance and dynamics in a single shot, we decompose this difficult task into two easier sub-tasks in a divide-and-conquer fashion, thus achieving remarkable results overall. The first sub-task is an image-to-image (I2I) translation task that synthesizes a high-quality starting frame from the input structural annotation map. The second sub-task is an image-to-video (I2V) generation task that uses the synthesized starting frame and the associated structural annotation map to animate the scene dynamics and generate a photorealistic and temporally coherent video. We employ a cycle-consistent flow-based conditioned variational autoencoder to capture long-term motion distributions; the learned bi-directional flows ensure the physical reliability of the predicted motions and provide explicit occlusion handling in a principled manner. Integrating structural annotations into the flow prediction also improves structural awareness in the I2V generation process. Quantitative and qualitative evaluations on autonomous driving and human action datasets demonstrate the effectiveness of the proposed approach over state-of-the-art methods. The code has been released at https://github.com/junting/seg2vid.
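The abstract states that bi-directional flows support explicit occlusion handling but does not spell out the mechanism. One common way to derive an occlusion estimate from a forward and a backward flow is a forward-backward consistency check, sketched below in plain Python. This is an illustration of that general idea only, not the paper's network or released code; the function name `occlusion_mask`, the `threshold` value, and the nearest-pixel lookup are all assumptions made for this example.

```python
def occlusion_mask(flow_fw, flow_bw, threshold=0.5):
    """Flag pixels of frame 0 whose forward/backward flows disagree.

    flow_fw[y][x] is a (dx, dy) displacement from frame 0 to frame 1;
    flow_bw[y][x] is a (dx, dy) displacement from frame 1 back to frame 0.
    Hypothetical helper: names and threshold are illustrative assumptions.
    """
    height = len(flow_fw)
    width = len(flow_fw[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow_fw[y][x]
            # Nearest pixel reached in frame 1 (no sub-pixel interpolation).
            tx = int(round(x + dx))
            ty = int(round(y + dy))
            if not (0 <= tx < width and 0 <= ty < height):
                # The pixel leaves the frame, so it has no match in frame 1.
                mask[y][x] = True
                continue
            bx, by = flow_bw[ty][tx]
            # For a visible pixel, going forward then backward should
            # return to the start, so the two displacements should cancel.
            round_trip_error = ((dx + bx) ** 2 + (dy + by) ** 2) ** 0.5
            mask[y][x] = round_trip_error > threshold
    return mask


# A 1x4 frame shifting one pixel to the right, with a consistent backward flow.
forward = [[(1.0, 0.0)] * 4]
backward = [[(-1.0, 0.0)] * 4]
mask = occlusion_mask(forward, backward)
```

In this toy case only the rightmost pixel is flagged, because it moves out of the frame; every other pixel returns to its starting position after the round trip.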

Journal: International Journal of Computer Vision
Publication Date: May 28, 2020
Citations: 2

Similar Papers

Video Generation From Single Semantic Label Map
Junting Pan ... Xu Jia
01 Jun 2019

Development of an End-to-End Deep Learning Framework for Sign Language Recognition, Translation, and Video Generation
B Natarajan ... E Rajalakshmi
IEEE Access | VOL. 10
01 Jan 2021

ODD-VGAN: Optimised Dual Discriminator Video Generative Adversarial Network for Text-to-Video Generation with Heuristic Strategy
Rayeesa Mehmood ... Kaiser J Giri
Journal of Information & Knowledge Management
29 Jul 2023

Model-based recognition of human walking in dynamic scenes
J.C Cheng ... J.M.F Moura
23 Jun 1997
