Abstract
Automatic speech recognition (ASR) is the field of research concerned with enabling computers to process and interpret human speech as accurately as possible. Speech is one of the simplest ways to convey a message, and as human-to-machine interfaces continue to evolve, speech recognition is expected to become increasingly important. Arabic, however, has distinct features that set it apart from other languages, such as its many dialects and the way its words are pronounced. Until now, insufficient attention has been devoted to speaker-independent continuous Arabic speech recognition with a limited database. This research proposes two techniques for the recognition of Arabic speech. The first combines convolutional neural network (CNN) and long short-term memory (LSTM) encoders with an attention-based decoder. The second is based on the Sphinx-4 recognizer and the associated CMU Sphinx tools (PocketSphinx, Sphinxbase, and SphinxTrain), with various types and numbers of extracted features (filter bank and mel frequency cepstral coefficients (MFCC)); the CMU Sphinx tools are also used to generate a language model for different sentences spoken by different speakers. Both approaches were tested on a dataset containing 7 hours of spoken Arabic from 11 Arab countries, covering the Levant, Gulf, and African regions of the Arab world, and achieved promising results. The CNN-LSTM model achieved a word error rate (WER) of 3.63% using 120 filter bank features and 4.04% using 39 MFCC features, while the Sphinx-4 recognizer achieved a WER of 8.17% (an accuracy of 91.83%) using 25 MFCC features and 8 Gaussian mixtures when tested on the same benchmark dataset.
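The results above are reported as word error rate (WER). As a minimal sketch of how WER is conventionally computed (the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words), the following may help. The function name `word_error_rate` and the example tokens are hypothetical, and the abstract does not state which scoring tool the authors used, so this should not be read as their evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: whitespace tokenization is an assumption, and
    Arabic text would normally be normalized before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution (b -> x) and one deletion (d)
# against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")  # 0.5
```

Under this definition, word accuracy is commonly reported as 100% minus WER, which agrees with the Sphinx-4 figures above (100% - 8.17% = 91.83%).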