Abstract

Text-to-speech (TTS) systems provide fundamental reading support for people with aphasia and reading difficulties. However, artificial voices are more difficult to process than natural voices. The current study extends the analysis of a clinical experiment that investigated which of three artificial voices and a digitised human voice is most suitable for people with aphasia and reading impairments. The results of that experiment show that the voice synthesised with Ogmios TTS, a concatenative speech synthesis system, caused significantly slower reaction times than the other three voices used in the experiment. The present study explores whether, and which, voice quality metrics are linked to delayed reaction times. For this purpose, the voices were analysed using an automatic assessment of intelligibility and naturalness, together with the jitter and shimmer voice quality parameters. This analysis revealed that Ogmios TTS generally performed worse than the other voices on all parameters. These observations could explain the significantly delayed reaction times in people with aphasia and reading impairments when listening to Ogmios TTS, and could inform the choice of TTS for compensatory devices for these patients based on voice analysis of these parameters.

Highlights

  • A literature review of the devices available for patients with aphasia (PWA) and acquired reading impairments found no research on devices built and designed for this specific population [4]

  • The analysis of the reaction time (RT) obtained during the clinical experiment showed that Ogmios

  • We highlighted the sentences synthesised with Ogmios that had the worst RTs to emphasise that they formed the majority when compared with the other TTS systems and the digitised human voice (DHV)

Summary

Introduction

A review of the literature on ways to enable or facilitate comprehension of written text in PWA and acquired reading impairments showed that this population experiences some improvement in comprehension when resorting to combined modality [1,2,3], meaning the combination of two different input modalities (e.g., auditory and written text) to access a text's content. PWA might struggle to comprehend artificial voices [5,6]. Both types of support use computer-generated voices for their output. In this regard, we carried out a clinical experiment to identify which artificial voice is best suited for PWA and acquired reading impairments. Reaction time is an indicator of cognitive load and may reflect difficulties and effort in processing information [7,8].
