Abstract

This audio-visual speech recognition approach improves noise robustness in mobile environments by extracting lip movement from side-face images. Although earlier bimodal speech recognition methods used frontal face (lip) images, they are inconvenient in practice because they require users to speak while holding a device with a camera in front of their face. The proposed approach, which captures lip movement with a small camera mounted in a handset, is more natural, simple, and convenient, and it also avoids degrading the signal-to-noise ratio (SNR) of the input speech. Visual features are extracted by optical-flow analysis and combined with audio features in a CNN-based recognizer.
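
The abstract only names the pipeline stages, so the following is a minimal sketch of how such a system could be wired together, not the authors' implementation. It assumes Farneback dense optical flow (via OpenCV) over a cropped mouth region for the visual stream, MFCC-style audio frames, and a small 1-D CNN for fusion; the function names, feature dimensions, and network shape are all illustrative assumptions.

```python
import numpy as np
import cv2
import torch
import torch.nn as nn

def lip_motion_features(gray_frames):
    """Per-frame visual features from dense optical flow over the lip region.

    gray_frames: list of 2-D uint8 arrays (cropped mouth region, grayscale).
    Returns shape (num_frames - 1, 2): mean horizontal and vertical flow,
    a crude proxy for lip opening/closing motion.
    """
    feats = []
    for prev, curr in zip(gray_frames, gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(
            prev, curr, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        feats.append(flow.reshape(-1, 2).mean(axis=0))  # mean (dx, dy)
    return np.asarray(feats, dtype=np.float32)

class AVFusionCNN(nn.Module):
    """1-D CNN over time for concatenated audio+visual feature frames."""
    def __init__(self, audio_dim=13, visual_dim=2, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(audio_dim + visual_dim, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the time axis
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, fused):  # fused: (batch, feat_dim, time)
        return self.net(fused)

# Usage: align audio and visual features to a common frame rate, concatenate
# along the feature axis, and classify. Stand-in random features below.
T = 50                                             # illustrative frame count
mfcc = np.random.randn(T, 13).astype(np.float32)   # stand-in audio features
visual = np.random.randn(T, 2).astype(np.float32)  # stand-in flow features
fused = torch.from_numpy(np.concatenate([mfcc, visual], axis=1).T)[None]
logits = AVFusionCNN()(fused)                      # (1, num_classes)
```

Early (feature-level) fusion by concatenation, as sketched here, is only one option; how the audio and visual streams are time-aligned and weighted under noise is the crux of any real system.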
