Method and System for Aligning Natural and Synthetic Video to Speech Synthesis

Andrea Basso

doi:10.1121/1.3592860

Abstract

In the MPEG-4 TTS architecture, facial animation can be driven by two streams simultaneously: text and Facial Animation Parameters. Text input is sent to a Text-to-Speech converter at the decoder, which drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes, known as bookmarks, in the text string transmitted to the Text-to-Speech converter; bookmarks can be placed between words as well as inside them. Each bookmark carries an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time and should be interpreted as a counter. The Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp, together with a real-time time stamp, to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp, using the encoder time stamp of the bookmark as a reference.
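
The alignment scheme described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the source: the `<ETS=n>` bookmark syntax, the function names `synthesize_with_bookmarks` and `align_faps`, and the stand-in "synthesizer" that simply charges a fixed number of milliseconds per character. A real Text-to-Speech converter would report the actual playback time at which each bookmark is reached.

```python
import re

# Hypothetical bookmark syntax, for illustration only: <ETS=n>,
# where n is the encoder time stamp (a counter, not a wall-clock time).
BOOKMARK = re.compile(r"<ETS=(\d+)>")


def synthesize_with_bookmarks(text, ms_per_char=60):
    """Stand-in for a TTS converter.

    Strips bookmarks from the text and records, for each encoder time
    stamp, the real-time time stamp (in ms) at which it was reached.
    Speech duration is faked as a fixed cost per character (assumption).
    """
    spoken_parts = []
    ets_to_rts = {}
    clock_ms = 0
    pos = 0
    for match in BOOKMARK.finditer(text):
        chunk = text[pos:match.start()]
        spoken_parts.append(chunk)
        clock_ms += len(chunk) * ms_per_char
        ets_to_rts[int(match.group(1))] = clock_ms
        pos = match.end()
    spoken_parts.append(text[pos:])  # text after the last bookmark
    return "".join(spoken_parts), ets_to_rts


def align_faps(fap_stream, ets_to_rts):
    """Attach a real-time time stamp to each (ets, fap) pair.

    Pairs whose encoder time stamp has no matching bookmark are skipped.
    """
    return [(ets_to_rts[ets], fap) for ets, fap in fap_stream if ets in ets_to_rts]


if __name__ == "__main__":
    # Bookmarks may sit inside a word ("Hel|lo") or between words.
    text = "Hel<ETS=1>lo <ETS=2>world"
    spoken, ets_to_rts = synthesize_with_bookmarks(text)
    faps = [(1, "smile"), (2, "raise_eyebrows")]  # hypothetical FAP payloads
    print(spoken)
    print(align_faps(faps, ets_to_rts))
```

The encoder time stamp acts only as a join key between the two streams; the playback time comes entirely from the speech side, which is why it need not correspond to real-world time at the encoder.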

More From: The Journal of the Acoustical Society of America
