Abstract
In this paper, a hybrid visual feature extraction method that combines an extended locally linear embedding (LLE) with visemic linear discriminant analysis (LDA) is presented for audio-visual speech recognition (AVSR). First, the extended LLE is introduced to reduce the dimensionality of the mouth images: it constrains the neighborhood search for each mouth sample to the corresponding individual's dataset rather than the whole dataset, and then maps the high-dimensional mouth image matrices into a low-dimensional Euclidean space. Second, the feature vectors are projected onto the visemic linear discriminant space to obtain optimal class separation. Finally, in the audio-visual fusion stage, minimum classification error (MCE) training based on the segmental generalized probabilistic descent (GPD) algorithm is applied to optimize the audio and visual stream weights. Experimental results on the CUAVE database show that the proposed method significantly outperforms classical PCA- and LDA-based methods in visual-only speech recognition. Further experiments demonstrate the robustness of the MCE-based discriminative training in noisy environments.
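To make the neighborhood constraint concrete, the following Python sketch (not taken from the paper; the names X, speaker_ids, k, and the regularizer reg are illustrative assumptions) shows how an extended LLE might restrict each sample's k-nearest-neighbor search to frames from the same speaker before computing the standard LLE reconstruction weights:

```python
import numpy as np

def restricted_neighbors(X, speaker_ids, i, k):
    """Find the k nearest neighbors of sample i, searching only within
    the same speaker's data -- the 'extended' LLE constraint described
    in the abstract. X is (n_samples, n_features) flattened mouth images."""
    same = np.flatnonzero((speaker_ids == speaker_ids[i]) &
                          (np.arange(len(X)) != i))
    dists = np.linalg.norm(X[same] - X[i], axis=1)
    return same[np.argsort(dists)[:k]]

def lle_weights(X, i, nbrs, reg=1e-3):
    """Standard LLE step: reconstruction weights for sample i from its
    neighbors, with a small regularizer for numerical stability."""
    Z = X[nbrs] - X[i]                           # center neighbors on x_i
    C = Z @ Z.T                                  # local Gram matrix
    C += reg * np.trace(C) * np.eye(len(nbrs))   # regularize
    w = np.linalg.solve(C, np.ones(len(nbrs)))
    return w / w.sum()                           # weights sum to one
```

Under this reading, restricting the neighbor search per speaker keeps each local linear patch on a single speaker's mouth-appearance manifold, which is the rationale the abstract gives for extending LLE.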