An Improved Visual Speech Recognition of Isolated Words using Combined Pixel and Geometric Features

N Radha, A Shahina, A Nayeemulla Khan

doi:10.17485/ijst/2016/v9i44/102234

Abstract

Objectives: This paper proposes a method to improve the performance of a Visual Speech Recognition (VSR) system by combining pixel-based and geometry-based features, so as to augment the performance of audio-based Automatic Speech Recognition (ASR) systems in adverse conditions. Methods/Statistical Analysis: A video database comprising 11000 utterances of isolated words, collected from 20 speakers, is used in this study. Pixel-based features (Discrete Cosine Transform, DCT, and Discrete Wavelet Transform, DWT) and geometric features (Active Shape Model, ASM) are fused at two levels: the feature level and the decision level. A simple Gaussian mixture HMM word model is built for feature-level fusion, while a two-stream HMM model is built for decision-level fusion. Findings: The VSR system built using the combined features shows a significant improvement in performance compared with the individual VSR systems built using pixel-based or geometric features alone. The individual systems achieve accuracies of 76% for geometric (ASM) features, 64% for DCT, and 72% for DWT pixel-based features. With feature-level fusion, accuracy improves to 80% for ASM+DCT and 84.7% for ASM+DWT. Weighted decision-level fusion results in a further improvement, with an accuracy of 84% for ASM+DCT and 92% for ASM+DWT. Application/Improvements: The combined VSR system could be preferred over individual pixel-based or geometric feature-based systems to augment the performance of audio-based ASR systems in adverse conditions. Further studies on improving the VSR system, which could be used in lieu of audio-based ASR systems in adverse situations, are being carried out.
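The two fusion levels named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the fusion formulas, so this sketch assumes that feature-level fusion concatenates the per-frame pixel and geometric vectors, and that weighted decision-level fusion takes a convex combination of per-word log-likelihoods from the two streams, a common convention for multi-stream HMMs. The function names, the weight value, and all scores below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def fuse_features(pixel: Sequence[float], geometric: Sequence[float]) -> List[float]:
    """Feature-level fusion (assumed form): concatenate the two per-frame vectors."""
    return list(pixel) + list(geometric)


def fuse_decisions(
    pixel_scores: Dict[str, float],
    geometric_scores: Dict[str, float],
    pixel_weight: float,
) -> str:
    """Weighted decision-level fusion (assumed form).

    Each dict maps a candidate word to the log-likelihood from one stream's
    word model. The combined score is
        pixel_weight * pixel + (1 - pixel_weight) * geometric,
    and the word with the highest combined score is returned.
    """
    if not 0.0 <= pixel_weight <= 1.0:
        raise ValueError("pixel_weight must lie in [0, 1]")
    if pixel_scores.keys() != geometric_scores.keys():
        raise ValueError("both streams must score the same vocabulary")
    combined = {
        word: pixel_weight * pixel_scores[word]
        + (1.0 - pixel_weight) * geometric_scores[word]
        for word in pixel_scores
    }
    return max(combined, key=combined.get)


# Hypothetical per-frame vectors: 3 pixel coefficients, 2 shape parameters.
fused_frame = fuse_features([0.5, -1.2, 0.3], [4.0, 2.5])

# Hypothetical log-likelihoods for a two-word vocabulary.
pixel_ll = {"yes": -120.0, "no": -118.0}
shape_ll = {"yes": -95.0, "no": -101.0}
decision = fuse_decisions(pixel_ll, shape_ll, pixel_weight=0.4)
```

With these made-up scores the pixel stream alone prefers "no", while the weighted combination prefers "yes", which shows how the stream weight can change the final decision. In practice the weight would be tuned on held-out data.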

Journal: Indian Journal of Science and Technology
Publication Date: Nov 24, 2016
Citations: 5
License type: cc-by

Similar Papers

Visual Speech Recognition using Fusion of Motion and Geometric Features
Radha N ... Shahina A
Procedia Computer Science | VOL. 171
01 Jan 2020

Lip-Reading Using Pixel-Based and Geometry-Based Features for Multimodal Human–Robot Interfaces
Denis Ivanko ... Alexey Karpov
30 Aug 2019

A Survey on Visual Speech Recognition Approaches
N Radha ... A Nayeemulla Khan
25 Mar 2021

Appearance and shape-based hybrid visual feature extraction: toward audio–visual automatic speech recognition
Saswati Debnath ... Pinki Roy
Signal, Image and Video Processing | VOL. 15
11 Jun 2020
