The effect of task conditions on the comprehensibility of synthetic speech

Jennifer Lai, Michael Considine, David Wood

doi:10.1145/332040.332451

Abstract

A study was conducted with 78 subjects to evaluate the comprehensibility of synthetic speech for various tasks ranging from short, simple e-mail messages to longer news articles on mostly obscure topics. Comprehension accuracy for each subject was measured for synthetic speech and for recorded human speech. Half the subjects were allowed to take notes while listening; the other half were not. Findings show no significant difference in comprehension of synthetic speech among the five text-to-speech engines used. Subjects who did not take notes performed significantly worse on all synthetic voice tasks than on the recorded speech tasks. Performance for synthetic speech in the non-note-taking condition degraded as the tasks got longer and more complex. Subjects who took notes also did significantly worse in the synthetic voice condition when averaged across all six tasks. However, average performance scores for the last three tasks in this condition show comparable results for human and synthetic speech, suggesting a training effect.

Full Text