Lip-Reading via Deep Neural Network Using Appearance-Based Visual Features

Fatemeh Vakhshiteh, Farshad Almasganj

doi:10.1109/icbme.2017.8430230

Abstract

Lip-reading is the visual interpretation of lip movements in order to understand speech when the normal sound is not accessible. Image processing techniques for lip-reading recognition have been widely applied in various kinds of applications. One such application is a computer-based video system developed to provide lip-reading instruction to hearing-impaired adults and teenagers. Automating the process raises several challenges: the coarticulation phenomenon, the homophone effect, insufficient training data per class, the choice of features, and speaker dependency. A method that overcomes these challenges is desirable. This paper describes a lip-reading model, highlighting its feature extraction and recognition parts. For the feature extraction part, a particular arrangement of blocks is considered in order to achieve optimal appearance-based features, while a properly structured Deep Belief Network (DBN) is used for the recognition part. The challenging CUAVE dataset is used in this study, and visual phone recognition (VPR) accuracies are reported at the phone level. The proposed lip-reading recognizer is a single model that is used for all speakers. The suggested method outperforms the conventional Hidden Markov Model (HMM)-based recognizer, and the best DBN achieves the best VPR accuracy of 45.63%.
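
The abstract reports VPR accuracy at the phone level but does not state the scoring formula. As an illustration only, the sketch below assumes the scoring convention common in phone recognition, accuracy = 100 * (N - S - D - I) / N, where N is the number of reference phones and S + D + I is the minimum number of substitutions, deletions, and insertions (the edit distance) between the reference and the recognized phone sequences. The function names and the example phone labels are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_accuracy(ref, hyp):
    """Phone-level accuracy in percent, assuming 100 * (N - S - D - I) / N."""
    if not ref:
        raise ValueError("reference phone sequence must not be empty")
    return 100.0 * (len(ref) - edit_distance(ref, hyp)) / len(ref)


# Hypothetical example: one substituted phone out of five.
reference = ["s", "eh", "v", "ah", "n"]
recognized = ["s", "eh", "f", "ah", "n"]
print(phone_accuracy(reference, recognized))  # 80.0
```

Under this convention, insertions are penalized as well, so the score can fall below zero when the recognizer emits many spurious phones. A frame-level or correctness-only measure would give different numbers.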

Similar Papers

Classification of Electrocardiogram Signals with Deep Belief Networks
Meng Huanhuan ... Zhang Yue
01 Dec 2014

Combined deep learning classifiers for stock market prediction: integrating stock price and news sentiments
Shilpa B L ... Shambhavi B R
Kybernetes | VOL. 52
09 Nov 2021

LIP-READING VIA DEEP NEURAL NETWORKS USING HYBRID VISUAL FEATURES
Fatemeh Vakhshiteh ... Ahmad Nickabadi
Image Analysis & Stereology | VOL. 37
09 Jul 2018

Discriminative keyword spotting using triphones information and N-best search
Shima Tabibian ... Babak Nasersharif
Information Sciences | VOL. 423
20 Sep 2017
