Abstract

This paper presents a mispronunciation detection system that uses automatic speech recognition to support computer-aided pronunciation training (CAPT). Our methodology extends a model pronunciation lexicon with possible phonetic mispronunciations that may appear in learners' speech. Generation of these pronunciation variants was previously achieved by means of phone-to-phone mapping rules derived from a cross-language phonological comparison between the primary language (L1, Cantonese) and the secondary language (L2, American English). This rule-based generation process produces many implausible mispronunciation candidates. We present a methodology that applies Viterbi decoding to learners' speech using an HMM-based recognizer and the fully extended pronunciation dictionary. Word boundaries are thus identified, and all pronunciation variants are scored and ranked by their Viterbi scores. Pruning is applied to keep the N-best pronunciation variants, which are deemed plausible candidates for mispronunciation detection. Experiments on speech recordings from 21 Cantonese learners of English show that the agreement between automatic mispronunciation detection and human judges exceeds 86%.
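
To make the scoring-and-pruning step concrete, the sketch below shows one way the N-best selection over pronunciation variants could be organized once word boundaries have been located. It is a minimal illustration, not the authors' implementation: the `viterbi_score` callable, the feature representation, and the `n_best` value are assumptions standing in for forced alignment of a word segment against the HMM models of each variant.

```python
# Minimal sketch: rank a word segment's pronunciation variants by Viterbi
# log-likelihood and keep the N best as plausible mispronunciation candidates.
# `viterbi_score` is a hypothetical stand-in for HMM forced alignment.

from typing import Callable, List, Sequence, Tuple

def prune_variants(
    segment: Sequence[float],                                  # acoustic features of one word segment
    variants: List[str],                                       # canonical + rule-generated pronunciations
    viterbi_score: Callable[[Sequence[float], str], float],    # log-likelihood of segment given a variant
    n_best: int = 5,                                           # illustrative pruning depth
) -> List[Tuple[str, float]]:
    """Score every pronunciation variant of a segment and keep the N best."""
    scored = [(v, viterbi_score(segment, v)) for v in variants]
    scored.sort(key=lambda pair: pair[1], reverse=True)        # higher log-likelihood = more plausible
    return scored[:n_best]                                     # retained mispronunciation candidates
```

The retained variants would then be compared against the canonical pronunciation at detection time; variants pruned here are never hypothesized as mispronunciations.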
