Abstract

In this paper, we describe the design and development of an HMM-based speech recognition system for the Mongolian language. Mongolian is a low-resource language in the area of speech processing. Building a Large Vocabulary Continuous Speech Recognition (LVCSR) system requires highly accurate acoustic models and large-scale language models. No Mongolian speech database or text corpus was available for this study, so we first collected a text corpus. The text was selected from television programs, newspapers, and the web, with the selection criterion of covering as many different subjects as possible. For the speech data, the most frequent words were selected from the text corpus. We are training the acoustic and language models based on Hidden Markov Models (HMMs). We evaluated the performance of isolated word recognition with context-independent and context-dependent models.
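
As an illustration of the word selection step, the following minimal sketch picks the most frequent words from a small text collection. It is not the procedure used in the paper, which the abstract does not detail: the function name `most_frequent_words`, the regex-based tokenization, the lowercasing, and the two sample sentences are all assumptions made for this example.

```python
import re
from collections import Counter


def most_frequent_words(texts, top_n):
    """Return the top_n most frequent word forms across texts (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        # Assumed tokenization: \w+ matches runs of Unicode letters, digits and
        # underscores, so Cyrillic Mongolian word forms are kept whole.
        counts.update(re.findall(r"\w+", text.lower()))
    return [word for word, _ in counts.most_common(top_n)]


# Toy stand-in for the collected corpus (illustrative sentences only).
corpus = [
    "монгол хэл монгол улс",
    "монгол хэл судлал",
]

print(most_frequent_words(corpus, 2))  # prints ['монгол', 'хэл']
```

A real corpus would likely need further normalization (for example, handling numerals and punctuation attached to words) before the counts are meaningful for building a recording word list.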
