Abstract
There is a pressing need to build automatic speech recognition (ASR) systems for low-resource languages. India is a multilingual country with more than 5000 languages, of which 22 are official. ASR systems for Indian languages are still in their infancy, mainly due to resource deficiency (e.g., a lack of transcribed speech data, pronunciation lexicons, and text corpora). Deep neural networks (DNNs) have significantly improved the performance of ASR systems for low-resource languages. Merging data from multiple languages is a common approach to training a multilingual DNN acoustic model, in which the shared hidden layers act as a global feature extractor. Multilingual ASR systems work better when the source and target languages are similar. In this work, we use two low-resource Indian languages, Hindi and Marathi. The two languages are closely related, belong to the same Indo-Aryan family, and share many common phonemes and words. State-of-the-art experiments were performed using different acoustic modeling and language modeling techniques. The experiments demonstrate that multilingual ASR systems consistently outperform monolingual ASR systems for both Hindi and Marathi.
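The shared-hidden-layer setup described above can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch example of a DNN acoustic model whose hidden layers are shared across languages while each language keeps its own output layer; the layer sizes, senone counts, feature dimension, and language identifiers are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch (assumption): a multilingual DNN acoustic model with
# shared hidden layers and language-specific output layers, in the spirit
# of the shared-hidden-layer approach described in the abstract.
# Layer sizes, senone counts, and feature dimension are illustrative only.
import torch
import torch.nn as nn

class MultilingualDNN(nn.Module):
    def __init__(self, senones, feat_dim=440, hidden_dim=1024, num_hidden=5):
        super().__init__()
        layers = []
        in_dim = feat_dim
        for _ in range(num_hidden):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        # Hidden layers are shared across all training languages and act
        # as a global (language-independent) feature extractor.
        self.shared = nn.Sequential(*layers)
        # One output layer per language, over that language's senone set.
        self.heads = nn.ModuleDict(
            {lang: nn.Linear(hidden_dim, n) for lang, n in senones.items()}
        )

    def forward(self, x, lang):
        return self.heads[lang](self.shared(x))

# Usage sketch: batches from both languages update the shared layers,
# while each batch only updates its own language's output head.
model = MultilingualDNN(senones={"hindi": 3000, "marathi": 2800})
feats = torch.randn(32, 440)           # e.g., spliced filterbank features
logits = model(feats, lang="hindi")    # senone scores for Hindi targets
```

In this kind of setup, gradients from both Hindi and Marathi mini-batches flow through the shared layers, which is how closely related languages can reinforce each other during multilingual training.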