Abstract

It is well known that visual cues from lip movement carry important speech-relevant information. This paper presents an automatic lipreading system for small-vocabulary speech recognition tasks. Using the lip segmentation and modeling techniques we developed earlier, we extract a visual feature vector composed of outer and inner mouth features from the lip image sequence. A spline representation is employed to transform the discrete-time features sampled from the video frames into the continuous domain. The spline coefficients within the same word class are constrained to share a similar form and are estimated from the training data by the EM algorithm. For the multiple-speaker/speaker-independent recognition task, an adaptive multimodel approach is proposed to handle the variations caused by different talking styles. After building the word models from the spline coefficients, a maximum likelihood classification approach is taken for recognition. Lip image sequences of the English digits 0 through 9 were collected for the recognition test. Two widely used classification methods, HMM and RDA, were adopted for comparison, and the results demonstrate that the proposed algorithm delivers the best performance among these methods.
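The pipeline summarized above can be sketched in simplified form: fit a smooth parametric curve to each discrete-time feature trajectory, represent each word class by a model over the fitted coefficients, and classify a test sequence by maximum likelihood. The sketch below is an illustration only, not the paper's method: it substitutes a least-squares polynomial fit for the spline representation, a fixed diagonal Gaussian for the EM-trained word model, and synthetic one-dimensional "mouth-opening" trajectories for real lip features; all function names and data are hypothetical.

```python
import numpy as np

def fit_trajectory(frames, degree=4):
    """Fit a smooth curve to a discrete feature trajectory.
    Stand-in for the paper's spline representation: a least-squares
    polynomial over normalized time [0, 1]; returns its coefficients."""
    t = np.linspace(0.0, 1.0, len(frames))
    return np.polyfit(t, frames, degree)

def classify(coeffs, word_models):
    """Maximum-likelihood classification: score the coefficient vector
    under each word's diagonal Gaussian (mean, var) and pick the best."""
    best_word, best_ll = None, -np.inf
    for word, (mean, var) in word_models.items():
        ll = -0.5 * np.sum((coeffs - mean) ** 2 / var + np.log(2 * np.pi * var))
        if ll > best_ll:
            best_word, best_ll = word, ll
    return best_word

# Toy example: two synthetic "words" with distinct opening curves.
t = np.linspace(0.0, 1.0, 30)
train = {"zero": np.sin(np.pi * t), "one": 4 * t * (1 - t) + 0.2}
models = {w: (fit_trajectory(x), np.full(5, 0.05)) for w, x in train.items()}
test_seq = np.sin(np.pi * t) + np.random.default_rng(0).normal(0, 0.02, t.size)
print(classify(fit_trajectory(test_seq), models))
```

In the paper itself the word models come from EM estimation over spline coefficients of many training sequences, and the features are multi-dimensional outer/inner mouth measurements rather than a single scalar trajectory.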
