Abstract

Image- and video-based human motions can be regarded as deformation processes of person appearances, so motion transfer is usually treated as a pose-guided image generation task performed in the 2D image plane. However, 2D image generation lacks guidance from the original 3D motion information, which leads to blur and shape distortions in the generated motion images. We therefore propose to simulate the generation process of real motion images by reconstructing 3D human models from the training motion images, driving them with target poses, and projecting them into the 2D plane. We take these 2D projections as pose representations and feed them into the generation model, since they naturally inherit the 3D information of the original motions. Because single-image human model reconstruction is unreliable on invisible surfaces, we propose a sequential-image-based human model refinement module that exploits the complementary information between adjacent motion frames to refine the 3D human model. Furthermore, we propose a face-attention GAN to conduct the final motion transfer: since the faces in the generated motion images strongly affect the perceived quality, we use a Gaussian distribution to match the elliptical face region and design a face-enhancement loss function. The generated motion images, with reliable depth information, accurate shapes, and clear faces, demonstrate the effectiveness of the proposed method.
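The abstract does not specify the exact form of the face-attention mechanism; as a minimal sketch of the stated idea, a 2D Gaussian weight map can approximate an elliptical face region and weight a reconstruction loss toward the face. The function names, the per-axis standard deviations, and the L1 form of the loss below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def gaussian_face_mask(h, w, center, sigma_y, sigma_x):
    # Hypothetical 2D Gaussian weight map: peak 1.0 at the face center,
    # decaying outward; per-axis sigmas give an elliptical footprint.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-(((ys - cy) ** 2) / (2.0 * sigma_y ** 2)
                    + ((xs - cx) ** 2) / (2.0 * sigma_x ** 2)))

def face_enhancement_loss(generated, target, mask):
    # Hypothetical face-enhancement term: L1 difference weighted by the
    # Gaussian mask, normalized by the total mask weight.
    return float(np.sum(mask * np.abs(generated - target)) / np.sum(mask))
```

In practice such a term would be added to the adversarial and reconstruction losses so that errors inside the face ellipse are penalized more heavily than errors elsewhere in the frame.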
