Abstract
This paper addresses the problem of synthesizing an animated face driven by a new audio sequence that is not present in the previously recorded database. Because both past and future video frames influence the dynamics of the current frame, the dynamics of speech and facial expression must be learned to build an effective speech-driven facial animation model. We incorporate the features of past and future frames, together with the current frame's features, to form a complex current-frame feature. In the testing phase, a k-nearest-neighbor algorithm recovers the simple current frame from the complex current frame. Because the inertia of the facial muscles differs from that of the vocal organs, speech features change at a different rate than the corresponding video-frame features. We therefore introduce an inter-frame distance vector as a feature of both speech and video, and use an audio-video hidden Markov model to map the speech features to video features.
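The context-window idea in the abstract — stacking past, current, and future frame features into a "complex" feature, then recovering the current frame via k-nearest neighbors at test time — can be sketched as follows. This is a minimal illustration assuming frame features are fixed-length vectors; the function names, the edge-padding by repetition, and averaging the k neighbors are illustrative assumptions, not details from the paper.

```python
import numpy as np

def complex_frame_features(frames, context=1):
    """Stack each frame's features with those of its past and future
    neighbours (sequence edges are padded by repeating the end frames)."""
    padded = np.concatenate([
        np.repeat(frames[:1], context, axis=0),
        frames,
        np.repeat(frames[-1:], context, axis=0),
    ])
    window = 2 * context + 1
    return np.stack([padded[i:i + window].ravel()
                     for i in range(len(frames))])

def knn_simple_frame(query_complex, db_complex, db_simple, k=3):
    """Recover a 'simple' current frame as the mean of the simple frames
    whose complex features are the k nearest to the query."""
    dists = np.linalg.norm(db_complex - query_complex, axis=1)
    nearest = np.argsort(dists)[:k]
    return db_simple[nearest].mean(axis=0)
```

With `context=1`, each complex feature is three times the length of a simple frame feature; at test time only the complex feature predicted from audio is needed to look up plausible simple frames in the recorded database.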