Abstract
Text-to-Speech (TTS) technologies have provided ways to produce acoustic approximations of human voices. However, recent advancements in machine learning (i.e., neural network TTS) have helped move beyond coarse mimicry and towards more natural-sounding speech. With only a small collection of recorded utterances, it is now possible to generate wholly synthetic voices indistinguishable from those of human speakers. While these new approaches to speech synthesis can facilitate more seamless experiences with artificial agents, they also lower the barrier to entry for those seeking to perpetrate deception. As such, in the development of these technologies, it is important to anticipate potential harms and devise strategies to mitigate misuse. This paper presents findings from a 360-person survey that assessed public perceptions of synthetic voices, with a particular focus on how voice type and social scenario affect ratings of trust. The findings have implications for the responsible deployment of synthetic speech technologies.
More From: Proceedings of the Human Factors and Ergonomics Society Annual Meeting