Analysis of the Effect of Audio Data Augmentation Techniques on Phone Digit Recognition For Algerian Arabic Dialect

Khaled Lounnas, Mourad Abbas, Mohamed Lichouri

doi:10.1109/icaase56196.2022.9931574

Abstract

In this study, we describe a solution to the problem of data scarcity in Speech Processing tasks for low-resource languages, including Automatic Speech Recognition (ASR). The method is based on a set of Data Augmentation (DA) techniques applied to a small initial corpus. This corpus comprises the first 100 Arabic digits uttered by two native Algerian speakers. We used several DA techniques to increase the size of this corpus: stretching the signal in time without changing its pitch, simulating a noisy environment by adding white noise, and shifting (rotating) the sound. A number of experiments were then carried out on two alternative configurations to assess the influence of these techniques on ASR performance, with augmented samples added either to the training set only or to both the training and testing sets. Experimental results show that data augmentation plays an important role in improving the accuracy of recognition models, although the individual effects of the Noise, Time Stretch, and Rotation methods are only slightly noticeable.
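
The noise and shift augmentations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `snr_db` parameter, and the choice of a circular shift for the rotation are assumptions made for illustration only. Pitch-preserving time stretching is left out because it requires a phase-vocoder style algorithm, which audio libraries typically provide.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng=None):
    """Add Gaussian white noise at a target signal-to-noise ratio (in dB).

    Hypothetical helper: the abstract does not say how the noise level was set.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    # Average power of the clean signal.
    signal_power = np.mean(signal ** 2)
    # Noise power that yields the requested SNR.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def shift_signal(signal, shift_samples):
    """Circularly shift (rotate) the waveform by a number of samples.

    Assumption: "rotation" is read here as a wrap-around shift, so samples
    pushed past the end reappear at the start.
    """
    return np.roll(np.asarray(signal), shift_samples)


def augment(signal, snr_db=20.0, shift_samples=1600, rng=None):
    """Return the original waveform plus one noisy and one shifted copy."""
    return [
        np.asarray(signal, dtype=np.float64),
        add_white_noise(signal, snr_db, rng),
        shift_signal(signal, shift_samples),
    ]
```

Calling `augment` on every utterance would triple the size of the corpus under these assumed settings. The noise level and shift amount are placeholders, not values reported by the study.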

Full Text