Abstract

Speech recognition is a process where an acoustic signal is converted to text or words or commands and recognizing the speech. In this paper, a Bangla numeral recognition system from the speech signal is developed utilizing Convolutional Neural Network (CNN). In the proposed system, a speech dataset of ten isolated Bangla digits has been developed consists of 6000 utterances (5 utterances for every 120 speakers) and a feature extraction procedure is performed to elicit significant features from the speech signals using Mel Frequency Cepstrum Coefficient (MFCC) analysis. Then, CNN is trained with the features of the speech signal as input. The efficiency of the proposed system is tested on the dataset developed for this purpose, and acquire 93.65% recognition accuracy. The proposed system is also compared with other existing methods of Bangla numeral speech recognition and outperforms most of the existing systems and proves the superiority of itself.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.