Abstract

Body language is one of the most common ways of expressing human emotion. In this article, we make the first attempt to generate an action video with a specific emotion from a single person image. The goal of the emotion-based action generation (EBAG) task is to generate action videos that express a specified type of emotion, given a single reference image showing a full human body. We divide the task into two parts and propose a two-stage framework. In the first stage, we propose an emotion-based pose sequence generation approach (EPOSE-GAN) that translates the emotion into a pose sequence. In the second stage, we generate the target video frames from three inputs (the source pose and the target pose as motion information, and the source image as the appearance reference) using a conditional GAN model with an online training strategy. Our framework produces the pose sequence and transforms the action independently, which highlights the fundamental role that the high-level pose feature plays in generating action videos with a specific emotion. The proposed method is evaluated on the “Soul Dancer” dataset, which was built for action emotion analysis and generation. The experimental results demonstrate that our framework can effectively solve the emotion-based action generation task. However, a gap in appearance details between the generated action videos and real-world videos still exists, which indicates that the emotion-based action generation task has great research potential.
