Abstract

Lipreading is the task of understanding speech by analysing the movement of the lips. Alternatively, it can be described as the process of decoding text from the visual information generated by the speaker's mouth movements. Understanding also depends on information provided by the context and on knowledge of the language. Lipreading, also called visual speech recognition, is a difficult task for humans, particularly in the absence of context. Many seemingly identical lip movements can produce different words, so lipreading is an inherently ambiguous problem at the word level. Even skilled lipreaders achieve low accuracy in word prediction on datasets with only a few words. Automated lipreading has been a topic of interest for many years, and advancements in machine learning have made it feasible. A machine that can read lip movements has great utility in various applications, such as automated lipreading for speakers with damaged vocal tracts, biometric person identification, multi-talker simultaneous speech decoding, silent-movie processing, and the improvement of audio-visual speech recognition in general. In this work we use a Convolutional Neural Network (CNN) from deep learning, with the MIRACL-VC1 dataset as our reference. The accuracy of our project came out to 76%.

Key Words: convolutional neural network, deep learning, lip reading, converting speech to text, OpenCV, Keras.
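The CNN mentioned in the abstract learns visual features from mouth-region frames through stacked convolution layers. A minimal sketch of the core operation, a single 2-D convolution applied to a grayscale frame, is shown below; the frame values and the edge-detecting kernel are illustrative assumptions, not taken from the authors' model, and the real network would learn its kernels from the MIRACL-VC1 data.

```python
# Sketch only: one valid-mode 2-D convolution, the building block of a CNN.
# The 4x4 "mouth-region" patch and the vertical-edge kernel are hypothetical.
import numpy as np

def conv2d(frame, kernel):
    """Slide `kernel` over `frame` (no padding) and sum elementwise products."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return out

# A toy patch whose left half is dark (0) and right half is bright (1),
# convolved with a vertical-edge kernel: the response peaks at the edge.
frame = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)
response = conv2d(frame, kernel)
print(response)
# The middle column of the 3x3 response is 2.0, marking the dark-to-bright edge.
```

In a full lipreading pipeline, many such learned kernels would be stacked with nonlinearities and pooling, and their outputs fed to dense layers that classify the spoken word.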

