Abstract

This paper addresses non-native accent issues in large-vocabulary continuous speech recognition. We analyze, at the level of initials and finals, the transformation rules of non-native Mandarin speech spoken by native speakers of Naxi and Dai in Yunnan. First, baseline HMM models are trained on the Project 863 standard Mandarin corpus and their performance on non-native speech recognition is tested. Second, the non-native speech data is transcribed using the baseline HMM models. From these transcriptions, we analyze the recognition error rates of all initials and finals, together with their typical substitution errors. The results of our experiments may be useful for adapting a native-speaker ASR system to model non-native accented data.
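
The per-unit analysis described above can be illustrated with a minimal sketch. It assumes that reference and recognized units (initials or finals) have already been aligned into pairs; the function name `substitution_stats` and the example pairs are hypothetical and are not data or code from the paper.

```python
from collections import Counter, defaultdict


def substitution_stats(aligned_pairs):
    """Compute per-unit error rates and the most frequent substitute.

    aligned_pairs: iterable of (reference_unit, recognized_unit) tuples,
    assumed to be pre-aligned (the alignment step is not shown here).
    """
    totals = Counter()                 # occurrences of each reference unit
    confusions = defaultdict(Counter)  # reference unit -> counts of wrong outputs
    for ref, hyp in aligned_pairs:
        totals[ref] += 1
        if hyp != ref:
            confusions[ref][hyp] += 1

    stats = {}
    for ref, total in totals.items():
        errors = sum(confusions[ref].values())
        top = confusions[ref].most_common(1)
        stats[ref] = {
            "error_rate": errors / total,
            "top_substitute": top[0][0] if top else None,
        }
    return stats


# Made-up illustrative pairs (not results from the paper).
example_pairs = [
    ("zh", "z"), ("zh", "z"), ("zh", "zh"), ("zh", "j"),
    ("ing", "in"), ("ing", "ing"),
]
stats = substitution_stats(example_pairs)
for unit, info in stats.items():
    print(unit, info)
```

Sorting the resulting table by error rate would give a ranked list of the units most affected by accent, which is one plausible starting point for choosing what to adapt in the native-speaker models.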
