Abstract

In this paper, we present a real-time lip-synch system that animates a 2-D avatar's lip motion in synch with an incoming speech utterance. To achieve real-time operation, the processing time was minimized by "merge and split" procedures, resulting in a coarse-to-fine phoneme classification. At each stage of phoneme classification, the support vector machine (SVM) method was applied to reduce the computational load while maintaining the desired accuracy. The coarse-to-fine phoneme classification is accomplished via two stages of feature extraction: in the first stage, each speech frame is acoustically analyzed into three classes of lip opening using Mel-frequency cepstral coefficients (MFCC) as the feature; in the second stage, each frame is further refined into a detailed lip shape using formant information. The method was implemented in a 2-D lip animation, and the system was demonstrated to accomplish real-time lip-synch effectively. The approach was tested on a PC with an Intel Pentium IV 1.4 GHz CPU and 384 MB RAM using Microsoft Visual Studio. It was observed that the combination of phoneme merging and SVM achieved about twice the recognition speed of a method employing a hidden Markov model (HMM). The typical latency per frame with the proposed method was on the order of 18.22 milliseconds, while an HMM method under identical conditions required about 30.67 milliseconds.
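
Since the full text is not available here, the following is only a minimal, illustrative sketch of the two-stage, coarse-to-fine classification described above, not the authors' implementation. It assumes 13 MFCCs per frame for the coarse lip-opening stage, rough LPC-based formant estimates for the fine lip-shape stage, and scikit-learn SVMs for both stages (librosa is used for feature extraction); the class labels, frame parameters, and feature dimensions are hypothetical.

import numpy as np
import librosa
from sklearn.svm import SVC

SR = 16000          # assumed sampling rate
FRAME = 512         # assumed frame length in samples
HOP = 256           # assumed hop length in samples

def mfcc_frames(y):
    """Stage-1 feature: 13 MFCCs per analysis frame (assumed dimensionality)."""
    m = librosa.feature.mfcc(y=y, sr=SR, n_mfcc=13, n_fft=FRAME, hop_length=HOP)
    return m.T                                    # shape: (n_frames, 13)

def formant_frames(y, n_formants=2):
    """Stage-2 feature: rough F1/F2 estimates from LPC roots, per frame."""
    feats = []
    for start in range(0, len(y) - FRAME + 1, HOP):
        frame = y[start:start + FRAME] * np.hamming(FRAME)
        a = librosa.lpc(frame, order=12)          # LPC polynomial coefficients
        roots = [r for r in np.roots(a) if np.imag(r) > 0]
        freqs = sorted(np.angle(roots) * SR / (2 * np.pi))[:n_formants]
        feats.append(list(freqs) + [0.0] * (n_formants - len(freqs)))
    return np.array(feats)                        # shape: (n_frames, n_formants)

# Stage 1: coarse classifier for three lip-opening classes (e.g. closed/half/open).
coarse_svm = SVC(kernel="rbf")
# Stage 2: one refining classifier per coarse class for detailed lip shape.
fine_svms = {c: SVC(kernel="rbf") for c in range(3)}

def classify_frame(mfcc_vec, formant_vec):
    """Coarse-to-fine decision for a single frame (SVMs must already be trained)."""
    coarse = int(coarse_svm.predict(mfcc_vec.reshape(1, -1))[0])
    fine = int(fine_svms[coarse].predict(formant_vec.reshape(1, -1))[0])
    return coarse, fine

if __name__ == "__main__":
    # Minimal demo on synthetic features; real training would use labeled speech frames.
    rng = np.random.default_rng(0)
    X_mfcc = rng.normal(size=(300, 13))
    X_form = rng.normal(size=(300, 2))
    y_coarse = rng.integers(0, 3, size=300)       # three lip-opening classes
    y_fine = rng.integers(0, 4, size=300)         # e.g. four detailed shapes per class
    coarse_svm.fit(X_mfcc, y_coarse)
    for c in range(3):
        idx = y_coarse == c
        fine_svms[c].fit(X_form[idx], y_fine[idx])
    print(classify_frame(X_mfcc[0], X_form[0]))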

Full Text
Paper version not known
