Abstract

The shape and movement of the human lips convey valuable visual information that is used in various applications, including automatic lip-reading (ALR), emotion recognition, biometric speaker identification, and virtual face animation. The image processing required to extract visual information from the lips typically involves three stages: face detection, location of the region of interest (ROI), and lip segmentation. This research focuses on lip segmentation, as the accuracy of this component is crucial to the performance of the overall system. The challenge of lip segmentation arises from variability in the speaker profile (colour, shape, facial hair, make-up) and from the dynamic ROI, which changes as the teeth and tongue appear during speech movements.
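As a rough illustration of the three-stage pipeline mentioned above (face detection, ROI location, lip segmentation), the sketch below uses OpenCV's stock Haar cascade for face detection, a crude lower-third heuristic for the mouth ROI, and a simple HSV colour threshold for the lip mask. The ROI heuristic and the colour thresholds are assumptions made only for this example and are not the method evaluated in the paper.

```python
import cv2
import numpy as np

def segment_lips(bgr_image):
    """Illustrative three-stage pipeline: face detection -> mouth ROI -> lip mask."""
    # Stage 1: face detection using the Haar cascade shipped with OpenCV.
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face found

    # Stage 2: locate the mouth ROI as the lower third of the face box
    # (a heuristic chosen only for this sketch).
    x, y, w, h = faces[0]
    roi = bgr_image[y + 2 * h // 3 : y + h, x : x + w]

    # Stage 3: lip segmentation by thresholding reddish hues in HSV space.
    # A real segmenter must also cope with skin tone, make-up, facial hair,
    # and the teeth and tongue that appear during speech.
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    low_reds = cv2.inRange(hsv, np.array([0, 60, 60]), np.array([12, 255, 255]))
    high_reds = cv2.inRange(hsv, np.array([168, 60, 60]), np.array([180, 255, 255]))
    return cv2.bitwise_or(low_reds, high_reds)  # binary lip mask within the ROI
```

In practice, the colour-threshold stage is the weak link, which is why the paper concentrates on more robust lip segmentation rather than on the first two stages.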
