Abstract
Visual speech recognition is the identification of utterances from the movements of the speaker's lips, tongue, teeth, and other facial muscles, without using the acoustic signal. This work shows the relative benefits of static and dynamic visual speech features for improved visual speech recognition. Two approaches to visual feature extraction are considered. (1) An image-transform-based static feature approach, in which the Discrete Cosine Transform (DCT) is applied to each video frame and the 6×6 triangle region coefficients are taken as features. Principal Component Analysis (PCA) is then applied over all 60 features corresponding to the video frame to reduce redundancy, and the resulting 21 coefficients are taken as the static visual features. (2) A motion-segmentation-based dynamic feature approach, in which the facial movements are segmented from the video using motion history images (MHI). DCT is applied to the MHI, and the triangle region coefficients are taken as the dynamic visual features. Two types of experiments were conducted to identify the utterances: one with concatenated features and another with features whose dimension was reduced using PCA. Left-right continuous hidden Markov models (HMMs) are used as the visual speech classifier to classify nine MPEG-4 standard viseme consonants. The experimental results show that both the concatenated and the dimension-reduced features improve visual speech recognition, with accuracies of 92.45% and 92.15%, respectively.
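The two feature extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made here for illustration: the "triangle region" is taken to be the low-frequency DCT coefficients with row + column index below 6, the MHI uses simple frame differencing with a linear decay, and the function names and the `threshold` parameter are hypothetical. PCA and the HMM classifier are omitted.

```python
import numpy as np

def dct2(image):
    """Orthonormal 2-D DCT-II computed with explicit basis matrices."""
    def basis(n):
        k = np.arange(n)[:, None]   # frequency index
        i = np.arange(n)[None, :]   # sample index
        m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        m[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
        return m
    rows, cols = image.shape
    return basis(rows) @ image @ basis(cols).T

def triangle_coefficients(coeffs, size=6):
    """Keep low-frequency coefficients with row + col < size (assumed convention)."""
    return np.array([coeffs[r, c] for r in range(size) for c in range(size - r)])

def motion_history_image(frames, threshold=25.0):
    """Simplified MHI: recent motion is bright, older motion decays linearly."""
    tau = len(frames) - 1  # number of frame differences
    mhi = np.zeros(frames[0].shape, dtype=float)
    for prev, curr in zip(frames[:-1], frames[1:]):
        moving = np.abs(curr.astype(float) - prev.astype(float)) > threshold
        # Moving pixels are reset to tau; all others lose one step of history.
        mhi = np.where(moving, tau, np.maximum(mhi - 1.0, 0.0))
    return mhi / max(tau, 1)  # scale to [0, 1]

def static_features(frame):
    """Triangle-region DCT coefficients of a single grayscale frame."""
    return triangle_coefficients(dct2(frame.astype(float)))

def dynamic_features(frames):
    """Triangle-region DCT coefficients of the MHI of a frame sequence."""
    return triangle_coefficients(dct2(motion_history_image(frames)))
```

Under the assumed triangle convention, `static_features(frame)` returns 21 values per frame and `dynamic_features(frames)` returns 21 values per utterance. How these map onto the 60 pre-PCA features mentioned in the abstract is not specified there, so the sketch does not attempt to reproduce that count.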