Abstract

Facial motion during speech is a direct consequence of vocal-tract motion, which also shapes the acoustics of speech. This fact suggests that speech acoustics can be used to estimate face motion, and vice versa. Another kinematic–acoustic relation that occurs during speech production is between head motion and fundamental frequency (F0). This paper focuses on the development of a system that takes speech acoustics as input and gives as output the coefficients necessary to animate natural face and head motion. The results obtained are based on simultaneous measurements of face deformation, head motion, and speech acoustics collected from two subjects during production of naturalistic sentences and spontaneous speech. The procedure for estimating face motion from speech acoustics first trains nonlinear estimators whose inputs are line spectrum pair (LSP) coefficients and whose outputs are marker positions on the face. These estimators are then applied to test data. The estimated marker trajectories are objectively compared with their measured counterparts, yielding correlation coefficients between 0.8 and 0.9. Linear estimators are used to relate F0 and head motion. Because the mapping from F0 to head motion is one-to-many, constraints must be added to estimate head motion; this is done by computing the co-dependence among head-motion components. Finally, measured and estimated face and head motion data are used to animate a naturalistic talking head.
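
As a rough illustration of the acoustics-to-face step the abstract describes, the sketch below trains a small nonlinear regressor to map per-frame LSP coefficients to face-marker coordinates and then scores the estimates with per-coordinate correlation coefficients, the evaluation metric reported above. This is not the authors' implementation: the synthetic data, feature dimensions, and network architecture are hypothetical stand-ins.

```python
# Minimal sketch (assumed setup, not the paper's system): map LSP speech
# features to 3-D face-marker positions with a nonlinear estimator, then
# compare estimated and measured trajectories by correlation.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical data: N frames of 16 LSP coefficients paired with the 3-D
# positions of 12 face markers (36 coordinates per frame).
N, n_lsp, n_coords = 5000, 16, 36
X = rng.standard_normal((N, n_lsp))               # LSP features per frame
Y = X @ rng.standard_normal((n_lsp, n_coords))    # stand-in marker trajectories
Y += 0.1 * rng.standard_normal(Y.shape)           # measurement noise

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

# A small multilayer perceptron stands in for whatever nonlinear mapping
# the paper trained on the training portion of the recordings.
net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
net.fit(X_tr, Y_tr)
Y_hat = net.predict(X_te)

# Objective evaluation on held-out test data: Pearson correlation between
# each estimated and measured marker coordinate.
corrs = [np.corrcoef(Y_te[:, k], Y_hat[:, k])[0, 1] for k in range(n_coords)]
print(f"mean per-coordinate correlation: {np.mean(corrs):.2f}")
```

The F0-to-head-motion step would follow the same train/evaluate pattern but with a linear estimator and added constraints, since a single F0 value is compatible with many head poses.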
