Abstract
Visual speech recognition (VSR) is the task of reading speech from the lip movements of a speaker. It depends heavily on the visual features extracted from image sequences. VSR is a demanding problem that poses a range of challenges for both human and machine-based approaches, and VSR methods address these challenges with machine learning. Visual speech helps people who are hearing impaired, laryngeal patients, and people in noisy environments. In this research, the authors developed their own dataset for the Kannada language. The dataset contains five randomly chosen words: Avanu, Bagge, Bari, Guruthu, and Helida. The average duration of each video is 1 s to 1.2 s. Machine learning is used for both feature extraction and classification. The authors applied the VGG16 convolutional neural network with the ReLU activation function to this custom dataset and obtained an accuracy of 91.90%. A comparison with HCNN, ResNet-LSTM, Bi-LSTM, and GLCM-ANN supports the effectiveness of the proposed system.
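The abstract does not describe the layers that follow the VGG16 feature extractor, so the following is only a minimal sketch of how a ReLU hidden layer and a softmax output could map a feature vector to one of the five words. The feature vector, the layer sizes (`FEATURE_DIM`, `HIDDEN_DIM`), and the random, untrained weights are all assumptions made for illustration; they are not the authors' network, and the printed prediction carries no meaning.

```python
import math
import random

# The five Kannada words named in the abstract.
WORDS = ["Avanu", "Bagge", "Bari", "Guruthu", "Helida"]


def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def classify(features, hidden_w, hidden_b, out_w, out_b):
    """Hidden ReLU layer followed by a softmax over the five words."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    probs = softmax(dense(hidden, out_w, out_b))
    best = max(range(len(WORDS)), key=lambda i: probs[i])
    return WORDS[best], probs


# Hypothetical sizes and random weights (assumptions, not from the paper).
rng = random.Random(0)
FEATURE_DIM, HIDDEN_DIM = 8, 4
hidden_w = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
            for _ in range(HIDDEN_DIM)]
hidden_b = [0.0] * HIDDEN_DIM
out_w = [[rng.uniform(-1, 1) for _ in range(HIDDEN_DIM)]
         for _ in range(len(WORDS))]
out_b = [0.0] * len(WORDS)

# Stand-in for the feature vector a convolutional backbone would produce.
features = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]

word, probs = classify(features, hidden_w, hidden_b, out_w, out_b)
print(word, [round(p, 3) for p in probs])
```

In a real pipeline the `features` list would come from the convolutional layers applied to the lip-region frames, and the weights would be learned from the labeled videos rather than drawn at random.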