Abstract

Speech intelligibility can be improved by presenting lip images together with the speech signal, so lip movement synthesis plays an important role in realizing a natural, human-like face for computer agents. This paper proposes a novel lip movement synthesis method from speech input based on Hidden Markov Models (HMMs). The difficulty of lip movement synthesis stems from coarticulation effects of preceding and succeeding phonemes. The proposed method offers a simple solution: it generates context-dependent lip parameters by looking ahead in the HMM state sequence obtained with context-independent HMMs. In objective evaluation experiments, the proposed method is evaluated by the time-averaged error and the time-averaged differential error between the synthesized lip parameters and the original ones. The results show that the time-averaged error and the time-averaged differential error of the HMM-based method with context-independent lip parameters are 8.7% and 32% smaller, respectively, than those obtained with a Vector Quantization (VQ)-based method. Moreover, the time-averaged error and time-averaged differential error of the proposed HMM-based method with context-dependent lip parameters are further reduced by 10.5% and 11%, respectively, compared to the HMM-based method with context-independent lip parameters. The proposed HMM-based method with context-dependent lip parameters yields its largest error reductions for the phonemes /h/, /g/, and /k/. In subjective evaluation experiments, although differences in audio-visual intelligibility between the synthesized lip parameters and the original ones are insignificant, the acceptability test evaluating naturalness reflects the results of the objective evaluation: mean opinion scores of acceptability for the VQ-based method and the proposed HMM-based method are 3.25 and 3.74, respectively.
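
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: blending each frame's context-independent lip parameters with upcoming HMM states to approximate coarticulation, and scoring trajectories with a time-averaged error and a time-averaged differential error. The function names, the fixed look-ahead window, the linear weighting, and the exact metric definitions are all assumptions, not the authors' method.

```python
import numpy as np

def synthesize_lip_params(state_seq, base_params, lookahead=2):
    """Map an HMM state sequence to lip parameters, blending each frame's
    context-independent parameters with up to `lookahead` future states
    (a simplified stand-in for the paper's look-ahead scheme).

    state_seq   : list of state ids, e.g. a Viterbi alignment from
                  context-independent HMMs (assumed input)
    base_params : dict mapping state id -> lip parameter vector
    """
    T = len(state_seq)
    frames = []
    for t in range(T):
        window = state_seq[t : min(t + lookahead + 1, T)]
        # Weight nearer states more heavily; the true weighting is not
        # specified in the abstract, so a simple linear decay is assumed.
        weights = np.linspace(1.0, 0.5, num=len(window))
        weights /= weights.sum()
        frames.append(sum(w * base_params[s] for w, s in zip(weights, window)))
    return np.stack(frames)

def time_averaged_error(synth, orig):
    """Mean per-frame Euclidean distance between synthesized and original
    lip parameter trajectories (assumed definition)."""
    return np.mean(np.linalg.norm(synth - orig, axis=1))

def time_averaged_differential_error(synth, orig):
    """The same metric applied to frame-to-frame differences, penalizing
    motion that is jerkier or smoother than the original (assumed)."""
    return time_averaged_error(np.diff(synth, axis=0), np.diff(orig, axis=0))
```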
