Improving diversity of speech‐driven gesture generation with memory networks as dynamic dictionaries

Zeyu Zhao, Guixuan Zhang, Jie Liu, Nan Gao, Shuwu Zhang, Zhi Zeng

doi:10.1049/cit2.12321

Abstract

Generating co‐speech gestures for interactive digital humans remains challenging because the problem is non‐deterministic. The authors observe that gestures generated from speech audio or text by existing neural methods often contain less movement shift than expected, so the resulting motion can look slow or dull. To address this, they propose a new generative model for speech‐driven gesture generation that uses memory networks as dynamic dictionaries to improve diversity. The model has two parts. The dictionary network dynamically stores connections between text features and pose features as a list of key‐value pairs, which serves as the memory that the pose generation network looks up. The pose generation network then merges the matching pose features with the input audio features to generate the final pose sequences. To measure the improvement more accurately, the authors also propose and test a new objective evaluation metric for gesture diversity that removes the influence of low‐quality motions. Quantitative and qualitative experiments demonstrate that the proposed architecture generates gestures with improved diversity.
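The key‐value lookup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how keys are matched, how the memory is updated, or how features are merged. The dot‐product softmax matching, the first‐in‐first‐out eviction, the concatenation in `merge`, and every name below (`KeyValueMemory`, `write`, `read`, `merge`) are assumptions made only for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class KeyValueMemory:
    """Hypothetical dictionary: text features are keys, pose features are values."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.keys: List[List[float]] = []
        self.values: List[List[float]] = []

    def write(self, text_feature: Sequence[float], pose_feature: Sequence[float]) -> None:
        # Assumption: the oldest pair is evicted when the memory is full.
        if len(self.keys) >= self.capacity:
            self.keys.pop(0)
            self.values.pop(0)
        self.keys.append(list(text_feature))
        self.values.append(list(pose_feature))

    def read(self, query: Sequence[float]) -> List[float]:
        # Assumption: soft lookup, weighting each stored pose feature by the
        # softmax of the dot product between the query and its key.
        if not self.keys:
            raise ValueError("memory is empty")
        scores = [sum(q * k for q, k in zip(query, key)) for key in self.keys]
        weights = softmax(scores)
        dim = len(self.values[0])
        return [sum(w * v[i] for w, v in zip(weights, self.values)) for i in range(dim)]


def merge(pose_feature: Sequence[float], audio_feature: Sequence[float]) -> List[float]:
    # Assumption: "merging" is plain concatenation before a pose decoder.
    return list(pose_feature) + list(audio_feature)


memory = KeyValueMemory(capacity=2)
memory.write([1.0, 0.0], [0.5, 0.5, 0.0])  # text feature -> pose feature
memory.write([0.0, 1.0], [0.0, 0.2, 0.9])

retrieved = memory.read([10.0, 0.0])        # query closest to the first key
decoder_input = merge(retrieved, [0.3, 0.7])
```

In this toy setup the query aligns with the first key, so the retrieved pose feature is dominated by the first stored value, and the decoder input is that feature followed by the audio feature.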

Journal: CAAI Transactions on Intelligence Technology
Publication Date: Apr 22, 2024
Citations: 1
License type: CC BY-NC-ND 4.0
