Abstract

Accurately tracking the mouth of a talking person is important for many applications, such as face recognition, audio-visual speech recognition, and human-computer interaction. It is in general a difficult problem due to the complexity of lip shapes, colors, and textures, and to changing lighting conditions. In this paper we develop techniques for inner lip feature extraction using a matching function based on a color module and a gradient module. Our numerical results show that extraction using both modules outperforms extraction with the color module alone. From the extracted continuous lip contours, facial animation parameters (FAPs) are derived and used to drive an MPEG-4 decoder. The FAPs are also applied in our audio-visual automatic speech recognition (AV-ASR) system to improve the recognition rate.
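To make the idea of combining a color cue and a gradient cue concrete, the following is a minimal sketch of a contour matching function, not the authors' implementation. The pseudo-hue color measure, the weighting parameter `alpha`, and the point-sampling scheme are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch: score a candidate inner-lip contour by combining a color
# module (pseudo-hue, which tends to respond on lip pixels) with a gradient
# module (intensity gradient magnitude, which is strong at lip edges).
import numpy as np

def pseudo_hue(rgb):
    """Color module: pseudo-hue r / (r + g) (an assumed lip-color measure)."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    return r / (r + g + 1e-6)

def gradient_magnitude(gray):
    """Gradient module: magnitude of the intensity gradient."""
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy)

def contour_score(image_rgb, contour_pts, alpha=0.5):
    """Score a candidate contour; higher means a better inner-lip match.

    image_rgb   : H x W x 3 uint8 image
    contour_pts : N x 2 integer array of (row, col) points on the contour
    alpha       : assumed weight balancing the color and gradient terms
    """
    gray = image_rgb.mean(axis=2)
    hue = pseudo_hue(image_rgb)
    grad = gradient_magnitude(gray)
    grad = grad / (grad.max() + 1e-6)          # normalize to [0, 1]

    rows, cols = contour_pts[:, 0], contour_pts[:, 1]
    color_term = hue[rows, cols].mean()        # color module response
    gradient_term = grad[rows, cols].mean()    # gradient module response
    return alpha * color_term + (1.0 - alpha) * gradient_term
```

In such a scheme, the candidate contour (e.g., points sampled along a parametric curve) that maximizes the combined score would be taken as the inner lip boundary; using both terms rather than the color term alone reflects the comparison reported in the paper.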
