Abstract

The basic goal of a Phonetic Engine (PE) is to determine the sequence of basic sound units, such as phones, present in a spoken utterance. In this work, we focus on developing PEs for two Indian languages, Bengali and Oriya. The framework used to develop these PEs can be extended to other Indian languages. Two separate PEs are developed, one for decoding spoken utterances in each language, using a read speech corpus. We use 35 phones for the Bengali PE and 32 phones for the Oriya PE. The PEs are built using Hidden Markov Models (HMMs) and Feedforward Neural Networks (FFNNs), with Mel-frequency Cepstral Coefficients (MFCCs) as features. In the Speaker Dependent (SD) case, the Bengali PE achieves accuracies of 41.65% with HMMs and 53.87% with FFNNs. Likewise, the Oriya PE achieves accuracies of 46.18% with HMMs and 59.88% with FFNNs in the SD case.
