Abstract

This paper addresses real-time detection and tracking of the lip region of a talking person in natural scenes. In particular, we aim to acquire numerical parameters that represent the lip information, because this information is important for many applications, such as audio-visual speech recognition, robot perception, and interfaces for mobile devices. The difficulty lies in the deformation and geometric change of the lips caused by speech and by free camera work. Our proposed system is based on template matching with genetic algorithms (GAs). Our previous system suffered from a trade-off between accuracy and processing time. We overcome this trade-off with two new methods: (a) flexible control of the search domain and (b) inheritance of genetic information between video frames. We demonstrate the effectiveness of the proposed system on several 5-second video sequences. On average, the accuracy is 94.44% and the processing time is 4.50 seconds.
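
The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the chromosome encoding, fitness function, or GA operators, so everything below is an assumption chosen for illustration. Individuals are plain (x, y) translations, fitness is the sum of absolute differences (SAD), selection is elitist truncation, and the names `sad`, `ga_match`, `track`, `seeds`, `radius`, and `keep` are hypothetical. Inheritance is modelled by seeding each frame's population with the best individuals of the previous frame, and search-domain control by restricting new individuals and mutation steps to a radius around those seeds.

```python
import random


def sad(image, template, x, y):
    """Sum of absolute differences between the template and the patch at (x, y)."""
    return sum(
        abs(image[y + j][x + i] - template[j][i])
        for j in range(len(template))
        for i in range(len(template[0]))
    )


def ga_match(image, template, rng, seeds=(), radius=None,
             pop_size=30, generations=40):
    """Return the final population, best individual first.

    seeds  : positions inherited from the previous frame (assumed mechanism)
    radius : if set, new individuals are drawn near the seeds and mutation
             steps are limited to this size (assumed search-domain control)
    """
    max_x = len(image[0]) - len(template[0])
    max_y = len(image) - len(template)

    def clamp(x, y):
        # Keep the template fully inside the image.
        return (min(max(x, 0), max_x), min(max(y, 0), max_y))

    def random_individual():
        if seeds and radius is not None:
            sx, sy = rng.choice(seeds)
            return clamp(sx + rng.randint(-radius, radius),
                         sy + rng.randint(-radius, radius))
        return (rng.randint(0, max_x), rng.randint(0, max_y))

    # Inherited individuals enter the population unchanged.
    population = [clamp(*s) for s in seeds][:pop_size]
    population += [random_individual() for _ in range(pop_size - len(population))]

    def cost(p):
        return sad(image, template, *p)

    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])  # crossover: x from one parent, y from the other
            if rng.random() < 0.3:  # mutation: small random shift
                step = radius if radius is not None else 3
                child = (child[0] + rng.randint(-step, step),
                         child[1] + rng.randint(-step, step))
            children.append(clamp(*child))
        population = parents + children

    population.sort(key=cost)
    return population


def track(frames, template, rng, keep=5, radius=4):
    """Search the first frame globally; later frames inherit the best `keep`
    individuals and search only near them."""
    seeds, path = [], []
    for frame in frames:
        population = ga_match(frame, template, rng, seeds=seeds,
                              radius=radius if seeds else None)
        seeds = population[:keep]
        path.append(population[0])
    return path
```

Calling `track(frames, template, random.Random(0))` on a list of grayscale frames (lists of rows of integers) returns one estimated (x, y) position per frame. Because the seeds are kept unchanged and selection is elitist, the best match in a frame is never worse than the best inherited individual, which is the property that would let later frames get by with a narrower search. A real lip tracker would also need scale, rotation, and shape parameters in the chromosome, which this sketch omits.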
