Evaluation of English Speech Recognition for Japanese Learners Using DNN-Based Acoustic Models

Jiang Fu, Akinori Ito, Yuya Chiba, Takashi Nose

doi:10.1007/978-3-030-03748-2_11

Abstract

For computer-assisted language learning (CALL) systems to make foreign language learning easier, they must recognize the learner's utterances with high accuracy. The quality of a CALL system depends mainly on the accuracy of its automatic speech recognition (ASR). However, because the pronunciation of non-native speakers differs greatly from that of native speakers, existing ASR systems cannot recognize their speech accurately. To solve this problem, this research proposes an acoustic model based on deep neural networks (DNN), trained on the ERJ (English Read by Japanese) database, which was collected from 202 Japanese learners. Compared with traditional ASR systems, the new system significantly improves speech recognition accuracy.

Full Text