Abstract

This article reports on the development and testing of an automatic speech recognition system (SAR) for spoken Arabic numerals, built on artificial neural networks. The study used speech recordings in the Yemeni dialect of Arabic, made in the Republic of Yemen. SAR recognizes isolated whole words and is implemented in two modes: speaker-dependent (the same speakers are used for training and testing the system) and speaker-independent (the speakers used for training differ from those used for testing). During recognition, the speech signal is first filtered to remove noise; the speech is then localized within the signal, and the signal is processed and analyzed using a Hamming window, with a time alignment algorithm compensating for differences in pronunciation. Informative features are extracted from the speech signal as mel-frequency cepstral coefficients. The developed SAR recognizes Arabic numerals of the Yemeni dialect with high accuracy: 96.2 % in speaker-dependent mode and 98.8 % in speaker-independent mode.
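The windowing and time alignment steps mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the alignment algorithm, so dynamic time warping (DTW) is assumed here as a common choice for isolated-word recognition, and a per-frame log energy stands in for the mel-frequency cepstral coefficient vectors. All function names (`hamming`, `windowed_frames`, `log_energy`, `dtw_distance`) are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n: w[k] = 0.54 - 0.46 * cos(2*pi*k / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def windowed_frames(signal, frame_len, hop):
    """Split a signal into overlapping frames and taper each with a Hamming window."""
    w = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + k] * w[k] for k in range(frame_len)])
    return frames


def log_energy(frame):
    """Toy per-frame feature (assumption: a stand-in for an MFCC vector)."""
    return math.log(sum(x * x for x in frame) + 1e-12)


def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D feature sequences.

    Assumption: DTW is used only as an example of a time alignment
    algorithm; the abstract does not say which one the system uses.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In such a setup, two utterances of the same numeral spoken at different speeds would each be passed through `windowed_frames`, reduced to feature sequences, and compared with `dtw_distance`, which tolerates the stretched or compressed timing that a frame-by-frame comparison would penalize.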
