Abstract

Speech recognition is the process of converting an acoustic signal into text, words, or commands. In this paper, a Bangla numeral recognition system for speech signals is developed using a Convolutional Neural Network (CNN). For the proposed system, a speech dataset of the ten isolated Bangla digits has been developed, consisting of 6000 utterances (5 utterances of each digit from each of 120 speakers). Significant features are extracted from the speech signals using Mel Frequency Cepstral Coefficient (MFCC) analysis, and the CNN is then trained with these features as input. Evaluated on this dataset, the proposed system achieves 93.65% recognition accuracy. The proposed system is also compared with existing methods for Bangla numeral speech recognition and outperforms most of them.
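The MFCC step mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the sample rate, frame length, hop size, filter count, and number of coefficients are assumed values chosen only for illustration, and the function names (`mfcc`, `mel_filterbank`, and so on) are hypothetical. The sketch frames a signal, takes a windowed power spectrum, applies a triangular mel filterbank, and decorrelates the log energies with a DCT, yielding one coefficient vector per frame that could serve as CNN input.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive DFT of a Hamming-windowed frame; returns bins 0..N/2.
    n = len(frame)
    w = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(w))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(w))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers spaced evenly on the mel scale.
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def dct2(values, n_coeffs):
    # Unnormalized DCT-II, keeping the first n_coeffs coefficients.
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    # All default parameters are assumptions, not values from the paper.
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        log_energies = [math.log(sum(f * s for f, s in zip(filt, spec)) + 1e-10)
                        for filt in bank]
        features.append(dct2(log_energies, n_coeffs))
    return features

# Usage: a synthetic 0.1 s, 440 Hz tone stands in for a recorded digit.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
features = mfcc(tone)
print(len(features), "frames x", len(features[0]), "coefficients")
```

In practice a library FFT would replace the naive DFT, which is used here only to keep the sketch self-contained.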
