Abstract

Recent advances in generative adversarial networks (GANs) have shown tremendous success in facial expression generation tasks. However, generating vivid and expressive facial expressions at the Action Unit (AU) level remains challenging, because automatic estimation of AU intensity is itself a difficult, unsolved problem. In this paper, we propose a novel synthesis‐by‐analysis approach that combines the GAN framework with a state‐of‐the‐art AU detection model to improve AU‐driven facial expression generation. Specifically, we design a novel discriminator architecture by modifying the patch‐attentive AU detection network for AU intensity estimation and combining it with a global image encoder for adversarial learning, which forces the generator to produce more expressive and realistic facial images. We also introduce a balanced sampling approach to alleviate the imbalanced learning problem in AU synthesis. Extensive experiments on DISFA and DISFA+ show that our approach outperforms the state of the art in photo‐realism and facial expressiveness, both quantitatively and qualitatively.
