Abstract

As speech synthesis technology develops more advanced paralinguistic capabilities, open questions emerge about how humans perceive robots' use of such vocal capabilities. Perceptions of spoken interaction are complex and are influenced by multiple factors, including the linguistic content of a message, the social context, the perceived intelligence of the agent, and the form factor of its embodiment. This paper reports results from a study that controlled for these factors in order to investigate the effect on human listeners of a male synthetic voice with an expressive range. Participants were randomly assigned to three conditions, counterbalanced for gender and language background, which varied how paralinguistic cues were applied. As the voice became more expressive and appropriate for the context, observers were more likely to describe the communication as effective but less likely to refer to the unseen agent as a person. Possible effects of listener gender and cultural-linguistic background are examined, and implications for future methodologies in this field are discussed.
