A Comprehensive Review of Data‐Driven Co‐Speech Gesture Generation

S Nyatsanga, T Kucherenko, C Ahuja, M Neff, G E Henter

doi:10.1111/cgf.14776

Open Access

https://doi.org/10.1111/cgf.14776

Journal: Computer Graphics Forum	Publication Date: May 1, 2023
Citations: 21	License type: publisher-specific-oa

Affiliation: University of California, Davis

Abstract

Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co‐speech gestures is a long‐standing problem in computer animation and is considered an enabling technology for creating believable characters in film, games, and virtual social spaces, as well as for interaction with social robots. The problem is made challenging by the idiosyncratic and non‐periodic nature of human co‐speech gesture motion, and by the great diversity of communicative functions that gestures encompass. The field of gesture generation has seen surging interest in the last few years, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep‐learning‐based generative models that benefit from the growing availability of data. This review article summarizes co‐speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule‐based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text and non‐linguistic input. Concurrent with the exposition of deep learning approaches, we chronicle the evolution of the related training datasets in terms of size, diversity, motion quality, and collection method (e.g., optical motion capture or pose estimation from video). Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human‐like motion; grounding the gesture in the co‐occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development.
