Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning

Nada Faisal Aljohani, Emad Sami Jaha

doi:10.32604/csse.2023.037113

Abstract

The continuing advances in deep learning have paved the way for several challenging ideas. One such idea is visual lip-reading, which has recently drawn considerable research interest. Lip-reading, often referred to as visual speech recognition, is the ability to understand and predict spoken speech based solely on lip movements, without using sound. Research on visual speech recognition for the Arabic language is scarce in general and absent in Quranic research, and this work aims to fill that gap. This paper introduces a new publicly available Arabic lip-reading dataset containing 10,490 videos captured from multiple viewpoints. It comprises data samples at the letter level (i.e., single letters (single alphabets) and Quranic disjoined letters) and at the word level, based on the content and context of the book Al-Qaida Al-Noorania. This research uses visual speech recognition to recognize spoken Arabic letters (Arabic alphabets), Quranic disjoined letters, and Quranic words, mainly phonetically, as they are recited in the Holy Quran according to the Quranic study aid entitled Al-Qaida Al-Noorania. This study could further validate the correctness of pronunciation and, subsequently, assist people in correctly reciting the Quran. Furthermore, a detailed description of the created dataset and its construction methodology is provided. The new dataset is used to train a pre-trained deep learning CNN model through transfer learning for lip-reading, achieving accuracies of 83.3%, 80.5%, and 77.5% on words, disjoined letters, and single letters, respectively. An extended analysis of the results is also provided. Finally, the experimental outcomes, different research aspects, and dataset collection consistency and challenges are discussed, and the paper concludes with several promising directions for future work.

Journal: Computer Systems Science and Engineering
Publication Date: Jan 1, 2023
License type: CC BY
