Visual Speech Recognition using Convolutional Neural Network

B Soundarya, R Krishnaraj, S Mythili

doi:10.1088/1757-899x/1084/1/012020

Open Access

Abstract

Visual speech recognition, or lip reading, determines speech by observing the movement of the lips, and it is used to teach differently abled persons to communicate with others. Traditional lip-reading recognition systems face several practical difficulties: the image processing is complicated, the classifiers are difficult to train, and the recognition process takes a long time. In this paper, we propose the use of a convolutional neural network-hidden Markov model (CNN-HMM) for lip reading. Because a CNN assigns importance to features of an input image, it makes differences among images easier to distinguish. The HMM is used to handle the dynamics of the image sequence. First, we convert the incoming video into images, and these images are then selected for further processing. The HMM provides a highly reliable way of speech recognition.
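To illustrate how an HMM can handle the dynamics of an image sequence, the following is a minimal sketch of Viterbi decoding over per-frame scores. It is not the authors' implementation: the abstract does not specify the model details, so the two states, the transition matrix, and the per-frame probabilities (standing in for CNN outputs on lip images) are all hypothetical values chosen for illustration.

```python
import math


def viterbi(log_emissions, log_trans, log_start):
    """Return the most likely HMM state path and its log score.

    log_emissions[t][s]: log score of state s at frame t (assumed here to
    come from a CNN applied to the lip image of frame t).
    log_trans[p][s]: log probability of moving from state p to state s.
    log_start[s]: log probability of starting in state s.
    """
    n_states = len(log_start)
    # best[s] is the best log score of any path ending in state s so far.
    best = [log_start[s] + log_emissions[0][s] for s in range(n_states)]
    backptrs = []
    for frame in log_emissions[1:]:
        new_best, ptr = [], []
        for s in range(n_states):
            # Pick the predecessor state that maximizes the path score.
            prev = max(range(n_states), key=lambda p: best[p] + log_trans[p][s])
            ptr.append(prev)
            new_best.append(best[prev] + log_trans[prev][s] + frame[s])
        best = new_best
        backptrs.append(ptr)
    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: best[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(best)


def to_log(rows):
    """Convert a matrix of probabilities to log probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical example: state 0 = "mouth closed", state 1 = "mouth open".
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
transitions = [[0.7, 0.3], [0.3, 0.7]]
start = [0.5, 0.5]

path, score = viterbi(to_log(frame_probs), to_log(transitions),
                      [math.log(p) for p in start])
print(path)  # [0, 0, 1, 1]
```

In this sketch the HMM smooths the frame-by-frame decisions: the transition probabilities favor staying in the same state, so the decoded path changes state only where the per-frame evidence is strong enough.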

Full Text