Analyzing Visible Articulatory Movements in Speech Production For Speech-Driven 3D Facial Animation

Abstract

Speech-driven 3D facial animation aims to generate realistic facial meshes from input speech signals. However, due to a limited understanding of visible articulatory movements, current state-of-the-art methods produce inaccurate lip and jaw motion. Traditional evaluation metrics, such as lip vertex error (LVE), often fail to reflect the quality of the visual results. Based on these observations, we expose the problems with existing evaluation metrics and argue for separate evaluation along each of the three spatial axes. Comprehensive analysis shows that most recent methods struggle to precisely predict lip and jaw movements in 3D space.
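To make the metric discussion concrete, the following is a minimal sketch of a commonly used LVE formulation (maximum per-frame L2 distance over lip vertices, averaged across frames) alongside a per-axis error that evaluates the x, y, and z directions separately, in the spirit of the abstract's argument. The function names, the exact LVE definition, and the `lip_idx` index set are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def lip_vertex_error(pred, gt, lip_idx):
    """LVE sketch: max L2 distance over lip vertices per frame, averaged over frames.

    pred, gt: arrays of shape (frames, vertices, 3); lip_idx: lip-vertex indices.
    """
    diff = pred[:, lip_idx] - gt[:, lip_idx]                  # (F, L, 3)
    per_frame_max = np.linalg.norm(diff, axis=-1).max(axis=-1)  # (F,)
    return float(per_frame_max.mean())

def per_axis_lip_error(pred, gt, lip_idx):
    """Mean absolute lip-vertex error computed separately along x, y, z."""
    diff = np.abs(pred[:, lip_idx] - gt[:, lip_idx])          # (F, L, 3)
    return diff.mean(axis=(0, 1))                             # (3,): x, y, z errors
```

A single scalar like LVE can hide axis-specific failures: a prediction that is accurate in the image plane but wrong in depth (z) may still score well, which is why a per-axis breakdown can be more informative.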
