Speech driven realistic mouth animation based on multi-modal unit selection

Dongmei Jiang, Werner Verhelst, Ilse Ravyse, Hichem Sahli

doi:10.1007/s12193-009-0015-7

Abstract

This paper presents a novel audio-visual diviseme (viseme pair) instance selection and concatenation method for speech-driven photo-realistic mouth animation. First, an audio-visual diviseme database is built, consisting of the audio feature sequences, intensity sequences and visual feature sequences of the instances. In the Viterbi-based diviseme instance selection, the accumulative cost is set as the weighted sum of three terms: 1) the logarithm of the concatenation smoothness of the synthesized mouth trajectory; 2) the logarithm of the pronunciation distance; and 3) the logarithm of the audio intensity distance between the candidate diviseme instance and the target diviseme segment in the incoming speech. The selected diviseme instances are time-warped and blended to construct the mouth animation. Objective and subjective evaluations of the synthesized mouth animations show that the proposed multimodal diviseme instance selection algorithm outperforms the triphone unit selection algorithm in Video Rewrite. The resulting mouth animations are clear, accurate and smooth, and match the pronunciation and intensity changes in the incoming speech well. Moreover, because the accumulative cost uses the logarithm function, it is easy to set the weights to obtain optimal mouth animations.
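
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact cost definitions, so the function name `viterbi_select`, the argument layout, the `eps` guard against `log(0)`, and the reading of all three terms as costs to be minimized are assumptions made for illustration only. Both the pronunciation distance and the intensity distance are assumed here to be measured between a candidate instance and the target segment.

```python
import math

def viterbi_select(pron_dist, inten_dist, concat_cost,
                   weights=(1.0, 1.0, 1.0), eps=1e-9):
    """Pick one candidate per target segment by dynamic programming.

    pron_dist[t][j], inten_dist[t][j]: hypothetical distances between
        candidate j and target segment t (non-negative).
    concat_cost(t, i, j): hypothetical non-negative smoothness cost of joining
        candidate i of segment t-1 to candidate j of segment t.
    weights: (w_concat, w_pron, w_inten) for the three log terms.
    eps: assumption, added before each log to avoid log(0).
    """
    w_c, w_p, w_i = weights

    def target(t, j):
        # Weighted log pronunciation distance plus weighted log intensity distance.
        return (w_p * math.log(pron_dist[t][j] + eps)
                + w_i * math.log(inten_dist[t][j] + eps))

    def join(t, i, j):
        # Weighted log concatenation cost between consecutive candidates.
        return w_c * math.log(concat_cost(t, i, j) + eps)

    # Accumulated cost for each candidate of the first segment.
    acc = [target(0, j) for j in range(len(pron_dist[0]))]
    back = []  # back[t-1][j] = best predecessor of candidate j at segment t

    for t in range(1, len(pron_dist)):
        new_acc, ptr = [], []
        for j in range(len(pron_dist[t])):
            best_i = min(range(len(acc)), key=lambda i: acc[i] + join(t, i, j))
            new_acc.append(acc[best_i] + join(t, best_i, j) + target(t, j))
            ptr.append(best_i)
        acc = new_acc
        back.append(ptr)

    # Trace the cheapest path back from the last segment.
    j = min(range(len(acc)), key=acc.__getitem__)
    total = acc[j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path, total


# Toy usage: two target segments with two candidates each (made-up numbers).
pron = [[1.0, 2.0], [2.0, 1.0]]
inten = [[1.0, 1.0], [1.0, 1.0]]
path, total = viterbi_select(pron, inten,
                             lambda t, i, j: 1.0 if i == j else 1.5)
```

Because each term enters through its logarithm, rescaling one of the raw distances by a constant factor only adds a constant to every path's cost and leaves the selected path unchanged. This is one plausible reading of why the abstract says the weights are easy to set.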

Journal: Journal on Multimodal User Interfaces
Publication Date: Dec 1, 2008
Citations: 29

Similar Papers

A Comparative Survey of Instance Selection Methods applied to Non-Neural and Transformer-Based Text Classification
Washington Cunha ... Leonardo Rocha
ACM Computing Surveys | VOL. 55
13 Jul 2023

Speech Driven Tongue Animation
Salvador Medina ... Denis Tome
01 Jun 2022

An Articulatory Approach to Video-Realistic Mouth Animation
Lei Xie ... Zhi-Qiang Liu
14 May 2006

Video Realistic Mouth Animation Based on an Audio Visual DBN Model with Articulatory Features and Constrained Asynchrony
Dongmei Jiang ... Peizhen Liu
01 Sep 2009
