Automatic word recognition for Bangla spoken language

Sara Binte Zinnat, Deen Md Abdullah, Md Imamul Hossain, Mohammad Nurul Huda, Razia Marzia Asheque Siddique

doi:10.1109/icspct.2014.6884886

Abstract

Automatic speech recognition (ASR), also known simply as speech recognition, is a computer technology that enables a device to recognize and understand spoken words by digitizing the sound and matching its patterns against stored patterns. In this research, we develop a system that is both speaker independent and gender independent and that can recognize continuous speech. In contrast to the traditional Mel-Frequency Cepstral Coefficient (MFCC) triphone model, the proposed system improves ASR performance by introducing a new feature type called local features (LFs). In the experiments, MFCCs and LFs are fed to Hidden Markov Model (HMM) based classifiers to measure word recognition performance. The experimental results show that, when male and female voices are evaluated separately, LF-25 yields a much better sentence correct rate, word correct rate and word accuracy than either MFCC-38 or MFCC-39. When male and female voices are evaluated together, the MFCC-39 based model gives better word accuracy and correct rate in some cases and the LF-25 based model in others. Our proposed Bangla word recognition system based on LF-25 is therefore a new approach in the field of Bangla ASR.
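The abstract reports sentence correct rate, word correct rate and word accuracy without defining them. The sketch below shows how these three figures are commonly computed, assuming the conventional definitions used by HMM toolkits: with N reference words, S substitutions, D deletions and I insertions, word correct rate is (N - D - S)/N and word accuracy is (N - D - S - I)/N. The function names `align_counts` and `score` and the sample sentences are hypothetical, and the alignment uses unit edit costs, which may differ from the weights used in the paper's own scoring tool.

```python
def align_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) along one minimum-edit
    alignment of two word lists. Unit costs are an assumption."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the end of both sequences to classify each edit.
    subs = dels = inss = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != hyp[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            subs += mismatch          # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1                 # reference word missing from hypothesis
            i -= 1
        else:
            inss += 1                 # extra word in hypothesis
            j -= 1
    return subs, dels, inss


def score(pairs):
    """Return percentages for a list of (reference, hypothesis) sentences."""
    n_words = subs = dels = inss = sentences_ok = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        s, d, i = align_counts(ref, hyp)
        n_words += len(ref)
        subs, dels, inss = subs + s, dels + d, inss + i
        sentences_ok += ref == hyp
    return {
        "sentence_correct": 100.0 * sentences_ok / len(pairs),
        "word_correct": 100.0 * (n_words - dels - subs) / n_words,
        "word_accuracy": 100.0 * (n_words - dels - subs - inss) / n_words,
    }


if __name__ == "__main__":
    # Made-up romanized sentences, for illustration only.
    sample = [
        ("ami bhat khai na", "ami bhat khai na"),
        ("ami bhat khai na", "ami mach khai na ekhon"),
    ]
    print(score(sample))
```

Word accuracy penalizes inserted words while word correct rate does not, so accuracy can never exceed the correct rate for the same set of alignments.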

Full Text