Abstract

In recent years, talking heads have received a great deal of interest, both for their application to natural human-computer dialogue and for their benefit to the intelligibility of synthesized speech. A model for the realistic synthesis of visual speech animation is described. Images representing the key visual speech poses (visemes) are pre-recorded and labeled. Transitions between visemes are created with an image morphing technique based on radial basis functions. Timing information from the Festival speech synthesis system is used to plan the transitions needed to create realistic speech animation. A model of coarticulation is included in the system to improve the realism of articulatory motion.
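
As an illustration of the radial basis function idea mentioned above, the following is a minimal sketch, not the described system's implementation. It assumes each viseme image carries a small set of labeled 2D landmark points, and it assumes a Gaussian basis function, which the abstract does not specify. All names (`fit_rbf_warp`, `apply_rbf_warp`, `intermediate_landmarks`, `sigma`, `ridge`) are hypothetical. The sketch only moves points; resampling pixel colours and cross-dissolving the two images are left out.

```python
import numpy as np


def _gaussian_kernel(a, b, sigma):
    """Gaussian RBF values between every point in a and every point in b."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-(dist / sigma) ** 2)


def fit_rbf_warp(src, dst, sigma=5.0, ridge=1e-10):
    """Solve for RBF weights that displace the src landmarks onto dst.

    src, dst: (n, 2) arrays of matching landmark positions (assumed input).
    Returns an (n, 2) weight array, one weight vector per landmark.
    """
    phi = _gaussian_kernel(src, src, sigma)
    # A tiny ridge term keeps the linear system well conditioned.
    phi = phi + ridge * np.eye(len(src))
    return np.linalg.solve(phi, dst - src)


def apply_rbf_warp(points, src, weights, sigma=5.0):
    """Displace arbitrary 2D points with the fitted warp."""
    return points + _gaussian_kernel(points, src, sigma) @ weights


def intermediate_landmarks(src, dst, t):
    """Landmark positions a fraction t (0..1) of the way from src to dst."""
    return src + t * (dst - src)


# Hypothetical mouth landmarks: two lip corners, upper lip, lower lip.
closed = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 3.0], [5.0, -3.0]])
opened = np.array([[1.0, 0.0], [9.0, 0.0], [5.0, 5.0], [5.0, -5.0]])

# Warp for the frame halfway through the transition.
halfway = intermediate_landmarks(closed, opened, 0.5)
weights = fit_rbf_warp(closed, halfway)
warped = apply_rbf_warp(closed, closed, weights)
```

In a full morph, a warp like this would be fitted for each in-between frame, with the frame times taken from the synthesizer's phone durations, and both viseme images would be warped toward the intermediate landmarks before blending.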
