Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features

Dongmei Jiang, Yanning Zhang, Yong Zhao, Hichem Sahli

doi:10.1007/s11042-013-1610-x

Abstract

This paper presents a photo realistic facial animation synthesis approach based on an audio visual articulatory dynamic Bayesian network model (AF_AVDBN), in which the maximum asynchronies between the articulatory features, such as lips, tongue and glottis/velum, can be controlled. Perceptual Linear Prediction (PLP) features from audio speech and active appearance model (AAM) features from face images, both taken from an audio visual continuous speech database, are used to train the AF_AVDBN model parameters. Given the trained model and an input audio speech signal, the optimal AAM visual features are estimated under a maximum likelihood estimation (MLE) criterion and are then used to construct the face images for the animation. In our experiments, facial animations are synthesized for 20 continuous audio speech sentences using the proposed AF_AVDBN model and two state-of-the-art methods: the audio visual state synchronous DBN model (SS_DBN), which implements a multi-stream Hidden Markov Model, and the state asynchronous DBN model (SA_DBN). Objective evaluations of the learned AAM features show that the AF_AVDBN model yields much more accurate visual features. Subjective evaluations show that the facial animations synthesized with AF_AVDBN are better than those synthesized with the state based SA_DBN and SS_DBN models, both in overall naturalness and in how accurately the mouth movements match the speech content.
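As a rough illustration of the last two steps in the abstract (estimating visual features by maximum likelihood, then constructing a face from AAM parameters), here is a minimal sketch under strong simplifying assumptions. It is not the AF_AVDBN model: it assumes a state path is already known for each frame and that each state has a single Gaussian over the visual features, in which case the likelihood-maximizing feature vector is simply the state mean. The function names, state labels, and toy numbers are all hypothetical, and the linear synthesis step is the generic AAM form (mean plus weighted modes), not the paper's specific face model.

```python
# Hypothetical sketch, NOT the paper's AF_AVDBN implementation.
# Assumptions: a per-frame state path is given, and each state has one
# Gaussian over the visual (AAM parameter) features.

def ml_visual_trajectory(state_path, state_means):
    """Return, per frame, the feature vector that maximizes a Gaussian
    observation likelihood given that frame's state: the state's mean."""
    return [list(state_means[s]) for s in state_path]


def aam_reconstruct(mean_vec, modes, params):
    """Generic linear AAM-style synthesis:
    x = mean_vec + sum_k params[k] * modes[k]."""
    out = list(mean_vec)
    for b, mode in zip(params, modes):
        for i, m in enumerate(mode):
            out[i] += b * m
    return out


# Toy data (made up): two states, two AAM parameters, a 4-D "shape".
state_means = {"closed": [-1.0, 0.0], "open": [2.0, 0.5]}
state_path = ["closed", "open", "open"]
mean_shape = [0.0, 0.0, 1.0, 0.0]
modes = [[0.0, 0.1, 0.0, -0.1],
         [0.05, 0.0, -0.05, 0.0]]

aam_params = ml_visual_trajectory(state_path, state_means)
frames = [aam_reconstruct(mean_shape, modes, b) for b in aam_params]
```

Each entry of `frames` is one reconstructed shape vector per audio frame. In the approach the abstract describes, the states would instead be inferred from the PLP audio features through the trained DBN, with asynchrony allowed between the articulatory streams.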

Journal: Multimedia Tools and Applications
Publication Date: Jul 27, 2013
Citations: 9

Similar Papers

Facial image de-identification using identiy subspace decomposition
Hehua Chi ... Yu Hen Hu
01 May 2014

Learning Gabor Features for Facial Age Estimation
Cuixian Chen ... Yishi Wang
01 Jan 2010

Audio visual speech recognition based on multi-stream DBN models with Articulatory Features
Dong-Mei Jiang ... Hichem Sahli
01 Nov 2010

Research into Human Face Recognition Algorithm Based on Active Appearance Model Feature Positioning
Siming Meng
Journal of Computational and Theoretical Nanoscience | VOL. 13
01 Oct 2016