Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning

Richeng Duan, Jinsong Zhang, Masatake Dantsuji, Tatsuya Kawahara

doi:10.1109/iscslp.2016.7918389

Abstract

To detect pronunciation errors produced by second-language learners and provide corrective feedback related to articulation, we investigate effective articulatory models based on deep neural networks (DNNs). Articulatory attributes are defined for manner and place of articulation. To train these models for non-native speech efficiently without using non-native speech data, which is difficult to collect on a large scale, we propose a multi-lingual learning method in which speech databases of the learners' target language (L2) and native language (L1) are combined. We also investigate multi-task learning methods by tuning the weights of the secondary task. These methods are applied to Mandarin Chinese pronunciation learning by native Japanese speakers. The effects of the multi-lingual and multi-task learning methods are confirmed in attribute classification and pronunciation error detection.
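
The idea of weighting a secondary task in multi-task learning can be illustrated with a minimal sketch. It assumes the common formulation in which the training objective is the primary-task loss plus a tunable weight times the secondary-task loss; the abstract does not state the exact objective, and the function names, the `secondary_weight` parameter, and the choice of cross-entropy are hypothetical illustrations rather than the authors' implementation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer class labels under softmax(logits)."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


def multitask_loss(primary_logits, primary_labels,
                   secondary_logits, secondary_labels,
                   secondary_weight):
    """Assumed objective: L = L_primary + secondary_weight * L_secondary.

    secondary_weight is the hyperparameter that would be tuned; a value of
    0.0 reduces the objective to single-task training on the primary task.
    """
    primary = cross_entropy(primary_logits, primary_labels)
    secondary = cross_entropy(secondary_logits, secondary_labels)
    return primary + secondary_weight * secondary
```

In such a setup, both output layers would typically share the hidden layers of one network, so the secondary task influences the shared representation in proportion to its weight.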

Full Text