Data Augmentation for Arabic Speech Recognition Based on End-to-End Deep Learning

Hamzah Alsayadi, Zaki Taha, Abdelaziz Abdelhamid, Islam Hegazy

doi:10.21608/ijicis.2021.73581.1086

Abstract

The end-to-end deep learning approach has greatly improved the performance of speech recognition systems. With deep learning techniques, however, overfitting remains the main problem when training data is scarce. Data augmentation is a suitable remedy for overfitting: it increases the quantity of training data and improves the robustness of the models. In this paper, we investigate a data augmentation method for enhancing Arabic automatic speech recognition (ASR) based on end-to-end deep learning. Data augmentation is applied to the original corpus to increase the training data through noise adaptation, pitch shifting, and speed transformation. A CNN-LSTM and an attention-based encoder-decoder are used to build the acoustic model and in the decoding phase. This method is considered state of the art in end-to-end deep learning, and to the best of our knowledge, no prior research has employed data augmentation for CNN-LSTM and attention-based models in Arabic ASR systems. In addition, the language model is built using RNN-LM and LSTM-LM methods. The Standard Arabic Single Speaker Corpus (SASSC) without diacritics is used as the original corpus. Experimental results show that applying data augmentation improves the word error rate (WER) compared with the same approach without data augmentation. The average reduction in WER achieved is 4.55%.
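As an illustration of two of the augmentation types named in the abstract, the sketch below adds white Gaussian noise at a chosen signal-to-noise ratio and changes playback speed by linear-interpolation resampling. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `add_noise` and `change_speed`, the 10 dB SNR, and the 0.9/1.1 speed factors are all hypothetical choices that the abstract does not specify. Pitch shifting without a change in duration needs an additional time-stretching step and is left out here.

```python
import math
import random


def add_noise(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR.

    `signal` is a list of float samples; `rng` is a random.Random instance.
    (Hypothetical helper; the SNR used in the paper is not stated.)
    """
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def change_speed(signal, factor):
    """Resample `signal` by linear interpolation.

    factor > 1 shortens the audio (faster); factor < 1 lengthens it (slower).
    Plain resampling like this also moves the pitch along with the speed.
    (Hypothetical helper; the factors used in the paper are not stated.)
    """
    last = len(signal) - 1
    n_out = max(1, int(round(len(signal) / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor              # fractional position in the input
        lo = min(int(pos), last)      # left neighbour, clipped to the end
        hi = min(lo + 1, last)        # right neighbour, clipped to the end
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # One second of a 440 Hz tone at 16 kHz stands in for a real utterance.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    augmented = [
        add_noise(clean, 10.0, rng),   # assumed SNR
        change_speed(clean, 0.9),      # assumed slow-down factor
        change_speed(clean, 1.1),      # assumed speed-up factor
    ]
    print([len(a) for a in augmented])
```

Each augmented copy would be added to the training set alongside the original utterance with the same transcript, which is how augmentation enlarges the corpus without new recordings.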


Journal: International Journal of Intelligent Computing and Information Sciences
Publication Date: Jul 19, 2021
Citations: 9

Similar Papers

Arabic speech recognition using end‐to‐end deep learning
Hamzah A Alsayadi ... Zaki T Fayed
IET Signal Processing | VOL. 15
02 Jun 2021

Non-diacritized Arabic speech recognition based on CNN-LSTM and attention-based models
Hamzah A Alsayadi ... Zaki T Fayed
Journal of Intelligent & Fuzzy Systems | VOL. 41
16 Dec 2021

Dialectal Arabic Speech Recognition using CNN-LSTM Based on End-to-End Deep Learning
Hamzah A Alsayadi ... Salah Al-Hagree
25 Oct 2022

Investigating the effects of gender, dialect, and training size on the performance of Arabic speech recognition
Eiman Alsharhan ... Allan Ramsay
Language Resources and Evaluation | VOL. 54
12 Oct 2020
