The relationship between word error rate and perceptual judgment

Seongjin Park, John Culnan

doi:10.1121/1.5147687

Abstract

This study examines the relationship between word error rate (WER) from an automatic speech recognition system and perceptual judgments (foreign-accentedness, fluency, and comprehensibility) from human raters. In previous studies, Franco et al. (1997) used HMM-derived scores based on posterior probabilities of phone segments, and Deville et al. (1999) used an HMM/ANN recognition approach, to show how the results of automatic speech recognition can be used to estimate perceptual judgments. Park and Culnan (2019) showed that neural network models can simulate human raters' perceptual judgments using only the speech signal, and suggested that the model worked better on accentedness judgments than on fluency judgments. In this study, we will examine whether the WERs of English sentences produced by speakers from three language groups (American, Korean, Chinese) differ significantly; if they do, we will analyze the correlations between WER and perceptual judgments. The perceptual data used in Park and Culnan (2019) will be used for the analysis. The preliminary results of this study will be used to identify important features for building more accurate automatic proficiency judgment models.
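For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance between a reference transcript and the recognizer's output (substitutions, deletions, and insertions), divided by the number of reference words. The abstract does not state which recognizer or scoring tool was used, so the following is only a minimal sketch of that conventional definition; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level Levenshtein distance / reference length.

    Illustrative sketch only; tokenization by whitespace is an assumption,
    not the scoring procedure used in the study.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a hypothesis that drops one word from a six-word reference has a WER of 1/6, and WER can exceed 1.0 when the hypothesis contains many insertions.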
