Performance Analysis of a HMM based Automatic Patient's Case History Generator in Bangla

Shah Yasser Aziz, Meherul Alam Sarker, Ashish Das, Sohel Rana Biplob, H M Arefin Hridoy, Md Jakaria Rahimi

doi:10.1109/ceeict.2018.8628096

Abstract

This paper examines the feasibility of a Hidden Markov Model (HMM) based speech recognition system as a Bangla transcription device for doctors, who dictate the case histories of their patients. The experiments are performed using the Hidden Markov Model Toolkit (HTK). The features used are the Mel Frequency Cepstral Coefficients (MFCC) of the audio signal, which comprise 39 features. The audio data is collected from ten male speakers, and the train-test split is 50–50. The system consists of a word parser program followed by an isolated word recognizer. The word parser takes discretely spoken sentences and outputs one audio segment per word. Each word segment is fed to the word recognizer, and the output words are concatenated. Five experiments were each run twice: because some words performed poorly in the first run, more training data was added for those low-accuracy words in the second run. The final sentence recognition accuracy was 80%, and for most words the recognition accuracy was above 90%. In conclusion, HMM-based recognition systems are feasible for transcription devices.
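The parse-then-recognize pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the word parser detects word boundaries, so the sketch assumes a simple short-time energy threshold. The names `split_words` and `transcribe`, the frame length, the threshold, and the `recognize_word` callback (a stand-in for the HTK isolated word recognizer) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def split_words(samples: Sequence[float], frame_len: int = 160,
                threshold: float = 0.01,
                min_silence_frames: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of word-like segments.

    Assumption: a frame counts as speech when its mean energy exceeds
    `threshold`, and a word ends after `min_silence_frames` quiet frames.
    """
    segments: List[Tuple[int, int]] = []
    start = None   # sample index where the current word began
    quiet = 0      # consecutive quiet frames seen inside a word
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = i * frame_len
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence_frames:
                # The word ended where the run of quiet frames began.
                segments.append((start, (i - quiet + 1) * frame_len))
                start, quiet = None, 0
    if start is not None:
        # Signal ended mid-word: close the segment at the last speech frame.
        segments.append((start, (n_frames - quiet) * frame_len))
    return segments


def transcribe(samples: Sequence[float],
               recognize_word: Callable[[Sequence[float]], str]) -> str:
    """Recognize each parsed word segment and concatenate the outputs."""
    return " ".join(recognize_word(samples[s:e])
                    for s, e in split_words(samples))


# Synthetic signal: two bursts of "speech" separated by silence.
signal = [0.0] * 800 + [0.5] * 640 + [0.0] * 800 + [0.5] * 480 + [0.0] * 320
word_spans = split_words(signal)
```

In a real system, `recognize_word` would extract MFCC features from the segment and score them against the trained word HMMs; here it is left as a caller-supplied function so the sketch stays self-contained.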

Full Text