Abstract

Reproducing basic human abilities has long been a central goal of Artificial Intelligence (AI) systems. Because speech is essential to human communication, AI has been applied to this field in the form of Automatic Speech Recognition (ASR). In this paper, we focus on the inception model as a solution for Arabic speech recognition, motivated by its remarkable results on image classification tasks. We adapted this model to ASR and evaluated it on a dataset of spoken Arabic digits collected from social media apps and published corpora, yielding more than 54,000 utterances. A comparison between the proposed model and a traditional Convolutional Neural Network (CNN) shows that the inception model performs better on this ASR task. The inception model achieved 99.70% accuracy on the training set, compared with 87.46% for the traditional CNN on the same set. It also performed better on the test subset, with 88.96% accuracy compared with the traditional model's 84.78% recognition rate.
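The reported accuracies can be compared with simple arithmetic. The following is a minimal sketch that uses only the four figures quoted above; the dictionary layout and the function name `accuracy_gap` are illustrative assumptions, not part of the paper.

```python
# Accuracies (in percent) as reported in the abstract.
# The data layout and helper name are hypothetical, chosen for illustration.
reported = {
    "inception": {"train": 99.70, "test": 88.96},
    "cnn": {"train": 87.46, "test": 84.78},
}


def accuracy_gap(scores):
    """Return train accuracy minus test accuracy, in percentage points."""
    return round(scores["train"] - scores["test"], 2)


# Train/test gap for each model.
gaps = {name: accuracy_gap(scores) for name, scores in reported.items()}

# Test-set advantage of the inception model over the CNN.
test_advantage = round(
    reported["inception"]["test"] - reported["cnn"]["test"], 2
)

for name, gap in gaps.items():
    print(f"{name}: train/test gap = {gap} points")
print(f"inception test advantage = {test_advantage} points")
```

On these figures, the inception model leads by 4.18 percentage points on the test subset, while its train/test gap (10.74 points) is larger than that of the traditional CNN (2.68 points).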
