Improving the Recognition Performance of Lip Reading Using the Concatenated Three Sequence Keyframe Image Technique

L Poomhiran, S Nuanmeesri, P Meesad

doi:10.48084/etasr.4102

Abstract

This paper proposes a lip reading method based on convolutional neural networks applied to the Concatenated Three Sequence Keyframe Image (C3-SKI), which consists of (a) the Start-Lip Image (SLI), (b) the Middle-Lip Image (MLI), and (c) the End-Lip Image (ELI), which marks the end of the pronunciation of a syllable. The lip area was reduced to 32×32 pixels per image frame, and three keyframes concatenated together, with a dimension of 96×32 pixels, were used to represent one syllable for visual speech recognition. The three concatenated keyframes representing a syllable are selected based on the relative maximum and relative minimum of the open lip’s width and height. The evaluation of the model’s effectiveness showed accuracy, validation accuracy, loss, and validation loss values of 95.06%, 86.03%, 4.61%, and 9.04%, respectively, for the THDigits dataset. The C3-SKI technique was also applied to the AVDigits dataset, showing 85.62% accuracy. In conclusion, the C3-SKI technique could be applied to perform lip reading recognition.
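The keyframe selection and concatenation described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not give the exact selection rule, so `pick_keyframes` uses a hypothetical one: the first relative maximum of a per-frame lip-opening measure is the middle frame, and the nearest relative minima on either side are the start and end frames. The function names, the grayscale list-of-rows frame format, the left-to-right SLI, MLI, ELI ordering, and the reading of 96×32 as width by height are all assumptions.

```python
def relative_extrema(values):
    """Return (minima, maxima): indices of strict interior local minima and maxima."""
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima.append(i)
        elif values[i] > values[i - 1] and values[i] > values[i + 1]:
            maxima.append(i)
    return minima, maxima


def pick_keyframes(opening):
    """Hypothetical rule (an assumption, not taken from the paper):
    middle = first relative maximum of the lip opening;
    start/end = nearest relative minima before/after it,
    falling back to the first/last frame when none exists."""
    minima, maxima = relative_extrema(opening)
    if not maxima:
        raise ValueError("no relative maximum found in the opening series")
    mid = maxima[0]
    start = max((i for i in minima if i < mid), default=0)
    end = min((i for i in minima if i > mid), default=len(opening) - 1)
    return start, mid, end


def concat_keyframes(sli, mli, eli, size=32):
    """Join three size-by-size frames side by side, giving rows of 3 * size pixels."""
    for frame in (sli, mli, eli):
        if len(frame) != size or any(len(row) != size for row in frame):
            raise ValueError(f"every keyframe must be {size}x{size} pixels")
    # Concatenate matching rows so the result is `size` rows by 3 * size columns.
    return [a + b + c for a, b, c in zip(sli, mli, eli)]


# Made-up lip-opening heights for seven frames of one syllable.
opening = [5, 2, 4, 9, 6, 3, 7]
# Placeholder 32x32 frames, each filled with its opening value for easy checking.
frames = [[[v] * 32 for _ in range(32)] for v in opening]

start, mid, end = pick_keyframes(opening)
c3ski = concat_keyframes(frames[start], frames[mid], frames[end])
print(start, mid, end)               # 1 3 5
print(len(c3ski[0]), len(c3ski))     # 96 32
```

With real image arrays, the same concatenation is usually written with NumPy as `np.concatenate([sli, mli, eli], axis=1)`; plain lists are used here only to keep the sketch dependency-free.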

Highlights

Deep learning applications, especially Convolutional Neural Network (CNN) applications, have recently achieved impressive success in diverse object detection and recognition tasks [1], but CNNs still face some challenges, particularly in video recognition
This paper proposes the application of the C3-SKI to a CNN for lip reading
The C3-SKI, consisting of the Start-Lip Image (SLI), Middle-Lip Image (MLI), and End-Lip Image (ELI), was tested in lip reading recognition on the THDigits and AVDigits datasets

Summary

Introduction

Deep learning applications, especially Convolutional Neural Network (CNN) applications, have recently achieved impressive success in diverse object detection and recognition tasks [1], but CNNs still face some challenges, particularly in video recognition. If the audio at a crucial moment is missing, the video’s contents may be misunderstood [2]. Such videos would be more useful if they were edited and the missing words or messages could be recovered. Most of the proposed solutions rely on lip reading to aid transcription, that is, on reading and observing the moving lips, together with the tongue and face, to get the right words. Transcribing or translating speech by lip reading is a skill that requires learning and practice until one becomes proficient at recognizing the lip movement or lip pattern related to the pronunciation of each syllable.

Journal: Engineering, Technology & Applied Science Research
Publication Date: Apr 11, 2021
Citations: 6
License type: CC BY 4.0
