Abstract

This study explores the production and perception of word-final devoicing in German in text-to-speech (TTS) and naturally produced utterances. The TTS voices come from technology used in common voice-AI “smart” speaker devices, specifically voices from Apple and Amazon. First, we compared the phonetic realization of word-final devoicing in TTS and naturally produced words. Acoustic analyses reveal that the presence of cues to a word-final voicing contrast varies across speech types. Naturally produced words with phonologically voiced codas contain partial voicing and have longer vowels than words with voiceless codas. However, these distinctions are not present in TTS speech. Next, to examine the intelligibility consequences of these devoicing patterns across speech types, German listeners completed a forced-choice identification task in which they heard the words and categorized the coda consonant. Listeners identified the intended coda more often in naturally produced words than in TTS words. Moreover, listeners systematically misidentified voiced codas as voiceless in TTS words. Overall, this study extends previous literature on speech intelligibility at the intersection of speech synthesis and contrast neutralization. TTS voices tend to neutralize salient phonetic cues present in natural speech. Consequently, listeners are less able to identify phonological distinctions in TTS. We also discuss how investigating which cues are more salient in natural speech can benefit synthetic speech generation, making synthetic voices more natural and easier to perceive.

