Abstract

We build a single automatic speech recognition (ASR) model for several south Indian languages using a common set of intermediary labels, which can be easily mapped to the desired native script through simple lookup tables and a few rules. We use the Sanskrit Library Phonetic encoding as the labeling scheme, which exploits the similarity in pronunciation across the character sets of multiple Indian languages. Unlike typical approaches, which use common label sets only for multilingual acoustic modeling, we also explore multilingual language modeling. Our unified model improves ASR performance for languages with limited speech data and under out-of-domain test conditions. It also performs reasonably well for languages that are well represented in the training data.
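
As an illustration of how a common label sequence might be mapped to a native script with lookup tables and a few rules, the following is a minimal sketch and not the authors' implementation. It assumes single-character, SLP1-style labels; the function name `labels_to_script`, the table layout, and the choice of Telugu and Kannada are hypothetical, and the tables cover only a handful of labels. The two rules shown are: a consonant followed by a vowel takes that vowel's dependent sign (none for the inherent "a"), and a consonant not followed by a vowel takes a virama.

```python
# Hypothetical lookup tables: common label -> Unicode character.
# Only a few labels are included; a real table would cover the full label set.
TABLES = {
    "telugu": {
        "consonants": {"k": "\u0C15", "m": "\u0C2E", "l": "\u0C32"},
        "vowels": {"a": "\u0C05", "A": "\u0C06", "i": "\u0C07"},  # independent forms
        "signs": {"a": "", "A": "\u0C3E", "i": "\u0C3F"},         # dependent vowel signs
        "virama": "\u0C4D",
    },
    "kannada": {
        "consonants": {"k": "\u0C95", "m": "\u0CAE", "l": "\u0CB2"},
        "vowels": {"a": "\u0C85", "A": "\u0C86", "i": "\u0C87"},
        "signs": {"a": "", "A": "\u0CBE", "i": "\u0CBF"},
        "virama": "\u0CCD",
    },
}


def labels_to_script(labels: str, script: str) -> str:
    """Map a string of common labels to the given native script."""
    table = TABLES[script]
    out = []
    i = 0
    while i < len(labels):
        label = labels[i]
        if label in table["consonants"]:
            out.append(table["consonants"][label])
            following = labels[i + 1] if i + 1 < len(labels) else None
            if following in table["signs"]:
                # Rule 1: consonant + vowel -> consonant + dependent vowel sign.
                out.append(table["signs"][following])
                i += 2
                continue
            # Rule 2: consonant with no following vowel -> add a virama.
            out.append(table["virama"])
        elif label in table["vowels"]:
            # A vowel not preceded by a consonant uses its independent form.
            out.append(table["vowels"][label])
        else:
            raise ValueError(f"no mapping for label {label!r}")
        i += 1
    return "".join(out)


if __name__ == "__main__":
    for script in ("telugu", "kannada"):
        print(script, labels_to_script("kamala", script))
```

Because the same label string feeds every table, one recognizer output can be rendered in whichever script is requested; only the table changes, not the rules.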
