Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation

Hui Fu, Ke Gong, Wenxiong Kang, Haojie Li, Tianshui Chen, Haifeng Zeng, Zeqing Wang, Keze Wang

doi:10.1609/aaai.v38i2.27945

Abstract

Speech-driven 3D facial animation aims to synthesize vivid facial animations that accurately synchronize with speech and match the speaker's unique speaking style. However, existing works primarily focus on achieving precise lip synchronization while neglecting to model the subject-specific speaking style, which often results in unrealistic facial animations. To the best of our knowledge, this work makes the first attempt to explore the coupled information between the speaking style and the semantic content in facial motions. Specifically, we introduce a speaking style disentanglement method that enables arbitrary-subject speaking style encoding and leads to a more realistic synthesis of speech-driven facial animations. We then propose a framework called Mimic, which learns disentangled representations of the speaking style and content from facial motions by building two latent spaces, one for style and one for content. Moreover, to facilitate disentangled representation learning, we introduce four constraints: an auxiliary style classifier, an auxiliary inverse classifier, a content contrastive loss, and a pair of latent cycle losses, which together contribute to the construction of the identity-related style space and the semantic-related content space. Extensive qualitative and quantitative experiments conducted on three publicly available datasets demonstrate that our approach outperforms state-of-the-art methods and is capable of capturing diverse speaking styles for speech-driven 3D facial animation. The source code and supplementary video are publicly available at: https://zeqing-wang.github.io/Mimic/
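
The abstract names a content contrastive loss but does not give its formula. As a rough illustration of the general idea only, the sketch below implements a generic InfoNCE-style contrastive loss in plain Python. It is an assumption, not the paper's formulation: the function names, the cosine similarity, the temperature value, and the toy two-dimensional "content codes" are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (assumed form, not taken from the paper).

    anchors[i] is expected to match positives[i]; every other row of
    positives acts as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Hypothetical content codes: two anchors and their candidate matches.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[0.9, 0.1], [0.1, 0.9]])  # correct pairing
swapped = info_nce(anchors, [[0.1, 0.9], [0.9, 0.1]])  # wrong pairing
```

With these toy inputs, the correctly paired codes give a lower loss than the swapped ones, which is the behavior a contrastive objective of this kind is meant to reward.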


Published in: Proceedings of the AAAI Conference on Artificial Intelligence
