Abstract

A text-driven 3D pronunciation visualization instruction system is proposed for computer-assisted language learning. Based on a 3D articulatory mesh model that includes both the facial appearance and the internal articulators, a finite element method and an anatomical model are used to synthesize articulatory animations of phonemes by fitting the mesh model to articulatory shapes detected in X-ray images. Visual co-articulation is modeled with a Hidden Markov Model trained on an articulatory speech corpus. The articulatory animations corresponding to all phonemes of a learned text are concatenated by the visual co-articulation model to produce a speech-synchronized articulatory animation. Experiments on Mandarin Chinese show that the system can improve learners' pronunciation accuracy.
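The concatenation step described above can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's method: it represents each phoneme's animation as a (frames × articulatory-parameters) array and cross-fades a few frames at each phoneme boundary, a crude linear stand-in for the HMM-based visual co-articulation model. The function name `concatenate_animations` and the `blend` parameter are assumptions for this sketch.

```python
import numpy as np

def concatenate_animations(phoneme_tracks, blend=3):
    """Concatenate per-phoneme articulatory parameter tracks.

    Each track is a (frames, params) array. At every phoneme
    boundary, the last `blend` frames of the running animation
    are linearly cross-faded with the first `blend` frames of
    the next phoneme's track -- a simple stand-in for smoothing
    transitions with a trained co-articulation model.
    """
    out = phoneme_tracks[0].copy()
    for seg in phoneme_tracks[1:]:
        n = min(blend, len(out), len(seg))
        w = np.linspace(0.0, 1.0, n)[:, None]  # fade-in weights
        # blend the overlap region of the two tracks
        out[-n:] = (1.0 - w) * out[-n:] + w * seg[:n]
        # append the remainder of the next phoneme's track
        out = np.vstack([out, seg[n:]])
    return out

# Example: two toy 2-parameter tracks (e.g. jaw opening, tongue height)
a = np.zeros((5, 2))   # first phoneme holds parameters at 0
b = np.ones((4, 2))    # second phoneme holds parameters at 1
anim = concatenate_animations([a, b], blend=3)
```

Here the blended region rises smoothly from the first phoneme's parameter values to the second's, so the concatenated animation has no abrupt jump at the boundary.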
