Abstract

Automatic speaking assessment methods are essential for helping non-native learners acquire native-like pronunciation. Automated speaking assessment comprises two tasks: mispronunciation detection and pronunciation quality assessment. Previous research has usually addressed only one of these tasks. Work on multifaceted speaking assessment has been rare, and existing models often lose performance because they omit local feature details. In this paper, we propose a multi-width band (MB) method and apply it to the Conformer model. The method strengthens the model's ability to capture local feature information at different scales. We also use multi-task learning to train a multifaceted speaking assessment model based on goodness of pronunciation (GOP) features. We conducted experiments on a self-built monosyllabic Mandarin mispronunciation detection dataset (PSC-MonoSyllable) and an open-source English pronunciation quality assessment dataset (SpeechOcean762). On PSC-MonoSyllable, the method reaches mispronunciation detection F1 scores of 70.18%, 80.06%, and 79.82% for phonemes, tones, and words, respectively. On SpeechOcean762, the method also improves, to some degree, on the baseline model in all phoneme- and grapheme-level correlation metrics for the pronunciation quality assessment task.
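For readers unfamiliar with GOP features, the following is a minimal sketch of one common formulation: the average log posterior of the canonical phone over its aligned frames. The abstract does not state which GOP variant the model uses, so the function name `gop_log_posterior`, the toy posterior values, and the frame boundaries are all illustrative assumptions rather than the authors' implementation.

```python
import math

def gop_log_posterior(posteriors, phone_id, start, end):
    """Average log posterior of the canonical phone over frames [start, end).

    posteriors: list of per-frame probability lists (hypothetical acoustic
    model output); phone_id: index of the canonical phone (assumed known
    from a forced alignment).
    """
    frames = posteriors[start:end]
    # Floor each probability to avoid log(0) on degenerate frames.
    logs = [math.log(max(frame[phone_id], 1e-10)) for frame in frames]
    return sum(logs) / len(logs)

# Toy posteriors: 4 frames, 3 phone classes (each row sums to 1).
posteriors = [
    [0.7, 0.2, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
]

# Frames 0-1 match the canonical phone 0 well; frames 2-3 do not.
gop_good = gop_log_posterior(posteriors, phone_id=0, start=0, end=2)
gop_bad = gop_log_posterior(posteriors, phone_id=0, start=2, end=4)
```

Under this formulation a score closer to zero indicates that the acoustic model is confident the canonical phone was produced, while a strongly negative score suggests a possible mispronunciation. Such per-phone scores could then serve as input features to a downstream assessment model.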
