A Critical Insight into Automatic Visual Speech Recognition System

Kiran Suryavanshi,Charansing Kayte,Suvarnsing Bhable

doi:10.1007/978-3-030-95711-7_1

Abstract

AbstractThis research paper investigated the robustness of the Automatic Visual Speech Recognition System (AVSR), for acoustic models that are based on GMM and DNNs. Most of the recent survey literature is surpassed in this article. Which shows how, over the last 30 years, analysis and product growth on AVSR robustness in noisy acoustic conditions has progressed? There are various categories of languages covered, with coverage, development processes, Corpus, and granularity varying. The key advantage of deep-learning tools, including a deeply convoluted neural network, a bi-directional long-term memory grid, a 3D convolutional neural network, and others, is that they are relatively easy to solve such problems, which are indissolubly linked to feature extraction and complex audiovisual fusion. Its objective is to act as an AVSR representative. KeywordsSpeech recognitionAVSRMFCCHMMCNNDNN

Full Text