Abstract

Lipreading is the ability to recognize words or sentences from the mouth movements of a speaking person, a process also known as Visual Speech Recognition (VSR). Lipreading has two main advantages: facilitating communication for people with hearing or speaking impairments, and aiding speech recognition in noisy environments. In this paper, we propose a lipreading computing system capable of recognizing ten common Arabic words from the speaker's mouth movements. The system receives a video of a person uttering an Arabic word as input and outputs the text of the predicted word. During the implementation stage of the proposed system, three deep learning and neural network architectures are alternatively used to train, validate, and test the system on a locally collected and preprocessed dataset. The dataset contains 1051 videos and will be made available upon request. Moreover, a voting model that combines the three architectures is proposed. The highest testing accuracy (i.e., 82.84%) is achieved by leveraging the voting model.
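The abstract does not specify how the voting model combines the three architectures' outputs. A common scheme for such ensembles is majority (hard) voting over the per-model predicted labels; the sketch below illustrates that idea only, and the function name and tie-breaking rule are assumptions, not the paper's method.

```python
from collections import Counter

def majority_vote(predictions):
    """Hypothetical hard-voting combiner: return the word predicted
    by most models. On a tie, fall back to the first model's output."""
    counts = Counter(predictions)
    top, top_count = counts.most_common(1)[0]
    # If more than one label shares the highest count, it is a tie.
    if list(counts.values()).count(top_count) > 1:
        return predictions[0]
    return top

# Example: three models classify one video of an uttered Arabic word.
print(majority_vote(["salam", "salam", "shukran"]))  # -> "salam"
```

An alternative would be soft voting, i.e., averaging the models' class-probability vectors before taking the argmax, which the paper may equally well intend.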
