Editable Co-Speech Gesture Synthesis Enhanced with Individual Representative Gestures

Yihua Bao, Nan Gao, Dongdong Weng

doi:10.3390/electronics13163315

Abstract

Co-speech gesture synthesis is a challenging task because the relationship between gestures and speech is complex and uncertain. Gestures that accompany speech (i.e., co-speech gestures) are an essential part of natural and efficient embodied human communication, as they work in tandem with speech to convey information more effectively. Although data-driven approaches have improved gesture synthesis, existing deep learning-based methods rely on deterministic modeling, which can average out the predicted gestures. These methods also offer little control over gesture generation, such as user editing of the generated results. In this paper, we propose an editable gesture synthesis method based on a learned pose script, which disentangles gestures into individual representative gestures and rhythmic gestures to produce high-quality, diverse, and realistic poses. Specifically, we first detect the times at which gestures occur in video sequences and transform them into pose scripts. Regression models are then built to predict the pose scripts. Next, the learned pose scripts are used for gesture synthesis, while rhythmic gestures are modeled using a variational auto-encoder and a one-dimensional convolutional network. Moreover, we introduce a large-scale Chinese co-speech gesture synthesis dataset with multimodal annotations for training and evaluation, which will be made publicly available to facilitate future research. The proposed method allows generated results to be re-edited by changing the pose scripts, for applications such as interactive digital humans. The experimental results show that this method generates higher-quality, more diverse, and more realistic gestures than existing methods.
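The abstract does not specify how a pose script is represented. As a minimal sketch of the editing idea only, the following assumes a script is a list of timed entries, each naming one representative gesture. The names `ScriptEntry`, `edit_script`, `frame_labels`, and the gesture labels are hypothetical and are not taken from the paper. Frames not covered by any entry are marked `None`, standing in for the frames that a rhythmic-gesture model would fill.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ScriptEntry:
    """One representative gesture on the speech timeline (hypothetical format)."""
    start: float      # start time in seconds
    end: float        # end time in seconds
    gesture_id: str   # illustrative label, e.g. "open_palm"


def edit_script(script: List[ScriptEntry], index: int, new_gesture_id: str) -> List[ScriptEntry]:
    """Return a copy of the script with the gesture at `index` swapped."""
    edited = list(script)
    edited[index] = replace(edited[index], gesture_id=new_gesture_id)
    return edited


def frame_labels(script: List[ScriptEntry], duration: float, fps: int = 10) -> List[Optional[str]]:
    """Expand a script into per-frame labels; None marks frames with no scripted gesture."""
    n_frames = round(duration * fps)
    labels: List[Optional[str]] = [None] * n_frames
    for entry in script:
        first = round(entry.start * fps)
        last = min(round(entry.end * fps), n_frames)
        for i in range(first, last):
            labels[i] = entry.gesture_id
    return labels


# A two-entry script, then a user edit that swaps the second gesture.
script = [ScriptEntry(0.2, 0.6, "open_palm"), ScriptEntry(1.0, 1.3, "point")]
edited = edit_script(script, 1, "wave")
labels = frame_labels(edited, duration=1.5, fps=10)
```

Because the edit returns a new list and the entries are immutable, the original script is left untouched, so an edit can be previewed and discarded without regenerating the rest of the sequence.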

Journal: Electronics	Publication Date: Aug 21, 2024
License type: CC BY 4.0

Similar Papers

Salient Co-Speech Gesture Synthesizing with Discrete Motion Representation
Zijie Ye ... Junliang Xing
04 Jun 2023

DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models
Sicheng Yang ... Minglei Li
01 Aug 2023

A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
S Nyatsanga ... G E Henter
Computer Graphics Forum | Vol. 42
01 May 2023

A Review of Evaluation Practices of Gesture Generation in Embodied Conversational Agents
Pieter Wolfert ... Tony Belpaeme
IEEE Transactions on Human-Machine Systems | Vol. 52
01 Jun 2022
