Abstract

A Voice Oriented Interactive Computing Environment (VOICE) has been implemented in the Hindi language. The system provides an interactive facility with visual and voice feedback. The 200-word isolated word recognition system is designed around a railway reservation enquiry task and uses acoustic-phonetic segments as the basic units of recognition. Frame-level classification into broad acoustic-phonetic categories is accomplished by a maximum likelihood classifier, and segmentation by hierarchical clustering of the frame-level likelihood vectors using explicit-duration semi-(hidden) Markov models. A more detailed classification of a few categories (vowels, voice bar and nasals in the first instance) is performed by neural nets. Lexical access, that is, conversion of the phonetic category symbol strings into words, is accomplished by string matching using dynamic programming. Distributed processing of the word recognition task enables recognition at four times real time. A language processor disambiguates between the multiple choices given by the recognizer for each word and even corrects some acoustic-level recognition errors. This, the first system working in any Indian language, gives a recognition performance of 85% at the word level. For comparison, a purely HMM-based word-level recognizer has also been implemented. The performance is expected to improve further, as there is still substantial scope for refinement.
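
The lexical access step described above, matching a recognized string of phonetic category symbols against the vocabulary by dynamic programming, can be illustrated with a minimal sketch. This is not the system's actual implementation: the abstract does not give the symbol set, the cost function or the lexicon, so the category labels (`STOP`, `VOWEL`, `NASAL`, ...), the three-word lexicon, the unit edit costs and the function names `edit_distance` and `lexical_access` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(observed: Sequence[str], reference: Sequence[str]) -> int:
    """Levenshtein distance between two symbol strings via dynamic programming.

    Unit costs for insertion, deletion and substitution are an assumption;
    the original system may well have used category-dependent costs.
    """
    # prev[j] = cost of aligning the observed prefix so far with reference[:j]
    prev = list(range(len(reference) + 1))
    for i, obs in enumerate(observed, start=1):
        curr = [i]  # aligning observed[:i] with an empty reference: i deletions
        for j, ref in enumerate(reference, start=1):
            substitution = prev[j - 1] + (0 if obs == ref else 1)
            deletion = prev[j] + 1       # drop a spurious observed symbol
            insertion = curr[j - 1] + 1  # account for a missed reference symbol
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def lexical_access(
    observed: Sequence[str],
    lexicon: Dict[str, List[str]],
    n_best: int = 3,
) -> List[Tuple[str, int]]:
    """Return the n_best lexicon words closest to the observed symbol string."""
    scored = sorted(
        (edit_distance(observed, symbols), word) for word, symbols in lexicon.items()
    )
    return [(word, dist) for dist, word in scored[:n_best]]


# Hypothetical lexicon: words mapped to invented broad-category transcriptions.
LEXICON: Dict[str, List[str]] = {
    "gaadi": ["STOP", "VOWEL", "STOP", "VOWEL"],
    "samay": ["FRIC", "VOWEL", "NASAL", "VOWEL", "GLIDE"],
    "kiraya": ["STOP", "VOWEL", "LIQUID", "VOWEL", "GLIDE", "VOWEL"],
}

if __name__ == "__main__":
    # A recognizer output in which the nasal segment was missed.
    observed = ["FRIC", "VOWEL", "VOWEL", "GLIDE"]
    print(lexical_access(observed, LEXICON))
```

Returning several ranked candidates rather than a single best word mirrors the abstract's description of a language processor that chooses between multiple recognizer outputs for each word; how the original system scored and passed on those candidates is not stated.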
