Analysis on mispronunciations in CAPT based on computational speech perception

Jia Jia, Ye Tian, Wai-Kim Leung, Lianhong Cai, Huadong Meng

doi:10.1109/iscslp.2012.6423530

Abstract

Computer-aided Pronunciation Training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language (L2) learners' speech. To further facilitate learning, we aim to develop a principled method for grading the severity of mispronunciations. This paper presents an approach to such gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes, and we use it to compute the distance between pairs of phonemes in the target (L2) language. The PD is found to correlate well with the mispronunciations detected in a CAPT system for Chinese learners of English, i.e. L1 being Chinese (Cantonese) and L2 being US English. These results indicate that auditory confusion indirectly reflects pronunciation confusions in L2 learning. The PD can also help grade the severity of errors (i.e. mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for learners.
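The prioritization idea in the last sentence can be illustrated with a minimal sketch. It is not the method of the paper: the PD values below are hypothetical placeholders rather than measured distances, the distance is assumed to be symmetric, and the names `Mispronunciation`, `perceptual_distance`, and `prioritize` are introduced here purely for illustration.

```python
from typing import NamedTuple


class Mispronunciation(NamedTuple):
    target: str    # phoneme the learner was expected to produce
    produced: str  # phoneme the learner actually produced


# Hypothetical PD values (placeholders, NOT results from the paper).
# Keys are unordered phoneme pairs, i.e. the distance is assumed symmetric.
HYPOTHETICAL_PD = {
    frozenset(("iy", "ih")): 0.2,
    frozenset(("th", "s")): 0.5,
    frozenset(("n", "l")): 0.8,
}


def perceptual_distance(a: str, b: str) -> float:
    """Look up the (assumed symmetric) perceptual distance between two phonemes."""
    if a == b:
        return 0.0
    return HYPOTHETICAL_PD[frozenset((a, b))]


def prioritize(errors: list[Mispronunciation]) -> list[Mispronunciation]:
    """Order errors so that confusions of more distant phonemes come first."""
    return sorted(
        errors,
        key=lambda e: perceptual_distance(e.target, e.produced),
        reverse=True,
    )


detected = [
    Mispronunciation("iy", "ih"),
    Mispronunciation("n", "l"),
    Mispronunciation("th", "s"),
]

for err in prioritize(detected):
    pd = perceptual_distance(err.target, err.produced)
    print(f"/{err.target}/ -> /{err.produced}/  PD={pd}")
```

Under this sketch, feedback for the confusion with the largest PD would be shown to the learner first; how PD is actually computed is left to the body of the paper.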

Full Text