Abstract

Code-switching is a common phenomenon in many multilingual communities, wherein a speaker switches between languages within a single utterance. Conventional Word Error Rate (WER) is not sufficient for measuring Automatic Speech Recognition (ASR) performance on code-mixed languages because of ambiguities in transcription, misspellings, and the borrowing of words from two different writing systems. These rendering errors artificially inflate the WER of an ASR system and complicate its evaluation. They also make it harder to accurately evaluate modeling errors originating from code-switched language and acoustic models. In this work, we propose a new metric, transliteration-optimized Word Error Rate (toWER), that smooths out many of these irregularities by mapping all text to one writing system, and we demonstrate that it correlates with the amount of code-switching present in a language. We also present a novel approach to acoustic and language modeling for bilingual code-switched Indic languages, using the same transliteration approach to normalize the data for three types of language models, namely a conventional n-gram language model, a maximum-entropy-based language model, and a Long Short-Term Memory (LSTM) language model, as well as for a state-of-the-art Connectionist Temporal Classification (CTC) acoustic model. We demonstrate the robustness of the proposed approach on several Indic languages from Google Voice Search traffic, with relative gains in ASR performance of up to 10% over the state-of-the-art baseline.
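As an illustration of the idea behind toWER, the following minimal sketch computes conventional WER and then recomputes it after mapping both reference and hypothesis to a single writing system. It is not the paper's implementation: the abstract does not describe the transliteration model, so the lookup table `TOY_LATIN_TO_DEVANAGARI`, the choice of Devanagari as the common script, and the function names `to_common_script` and `to_wer` are all assumptions made for this example.

```python
from typing import Dict, List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Conventional WER: word edit distance divided by reference length."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


# Hypothetical toy table; a real system would use a trained
# transliteration model rather than a hand-written dictionary.
TOY_LATIN_TO_DEVANAGARI: Dict[str, str] = {
    "phone": "फोन",
    "mobile": "मोबाइल",
}


def to_common_script(text: str, table: Dict[str, str]) -> str:
    """Map every word found in the table to the common script (assumed: Devanagari)."""
    return " ".join(table.get(word.lower(), word) for word in text.split())


def to_wer(ref: str, hyp: str, table: Dict[str, str]) -> float:
    """WER computed after both strings are mapped to one writing system."""
    return wer(to_common_script(ref, table), to_common_script(hyp, table))


if __name__ == "__main__":
    reference = "मेरा phone खराब है"    # transcriber wrote the borrowed word in Latin script
    hypothesis = "मेरा फोन खराब है"     # recognizer emitted it in Devanagari
    print("WER:  ", wer(reference, hypothesis))
    print("toWER:", to_wer(reference, hypothesis, TOY_LATIN_TO_DEVANAGARI))
```

In this toy case the only mismatch is a script difference on one borrowed word, so conventional WER counts a substitution, while the transliterated comparison counts none. Genuine recognition errors would still be penalized under both metrics.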
