Generating Dance Videos Using Pose Transfer Generative Adversarial Network With Multiple Scale Region Extractor and Learnable Region Normalization

Hsu-Yung Cheng, Chih-Lung Lin, Chih-Chang Yu

doi:10.1109/mmul.2021.3113312

Abstract

In this article, we propose a pose transfer framework that handles large body motion in order to generate dance videos. To address the body shape deformation caused by large movements, we propose a multiple scale region extractor (MSRE). The features of each body region are extracted from multiple layers of the encoder according to the body key points and passed through shortcuts to the decoder, which reduces the loss of spatial information. To improve the quality of the generated images, we add to the loss function a region style loss computed from the style representations of the body regions. In addition, learnable region normalization is integrated into the framework so that corrupted regions do not introduce undesired mean and variance shifts during normalization. Experiments show that the proposed system significantly improves pose generation results compared with existing methods, especially when the dancing poses involve large body movements.
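
The idea behind region normalization can be illustrated with a minimal sketch: statistics are computed separately for corrupted and uncorrupted pixels, so that one region cannot shift the mean and variance of the other. The code below is an assumption-laden illustration, not the authors' implementation. The function name `region_normalize`, the single-channel list-of-lists layout, and the externally supplied binary mask are all hypothetical, and the learnable part of the method (learning which regions are corrupted and learning the scale and shift parameters) is omitted.

```python
from math import sqrt


def region_normalize(features, mask, eps=1e-5):
    """Normalize one feature channel separately inside and outside a mask.

    features: list of rows of floats (a single 2-D feature channel).
    mask:     same shape; 1 marks a corrupted pixel, 0 an uncorrupted one.
              (Assumption: the mask is given; the paper's learnable variant
              is not modeled here.)
    """
    out = [[0.0] * len(row) for row in features]
    for region in (0, 1):
        # Collect the values that belong to this region only.
        values = [f for frow, mrow in zip(features, mask)
                  for f, m in zip(frow, mrow) if m == region]
        if not values:
            continue  # nothing to normalize in an empty region
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = sqrt(var + eps)
        # Write back the values normalized with this region's own statistics.
        for i, (frow, mrow) in enumerate(zip(features, mask)):
            for j, (f, m) in enumerate(zip(frow, mrow)):
                if m == region:
                    out[i][j] = (f - mean) / std
    return out


# Toy input: the bottom row plays the role of a corrupted region whose
# values are on a very different scale from the rest of the map.
features = [[1.0, 3.0], [100.0, 300.0]]
mask = [[0, 0], [1, 1]]
normalized = region_normalize(features, mask)
```

With a single set of statistics over the whole map, the large values in the masked row would dominate the mean and variance. Normalizing each region on its own leaves both rows centered at zero with roughly unit variance.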

Full Text