Abstract

Multisensory integration influences emotional perception, as the McGurk effect demonstrates for human communication. Human physiology implicitly links the production of visual features with other modalities such as the audio channel: the facial muscles responsible for a smiling face also stretch the vocal cords, resulting in a characteristic smiling voice. For artificial agents capable of multimodal expression, this linkage must be modeled explicitly. In our study, we observe the influence of the visual and audio channels on the perception of an agent's emotional state. We created two virtual characters to control for anthropomorphic appearance and recorded videos of these agents with either matching or mismatching emotional expressions in the audio and visual channels. In an online study, we measured the agents' perceived valence and arousal. Our results show that a matched smiling voice and smiling face increase both dimensions of the circumplex model of emotions: ratings of valence and arousal rise. When the channels present conflicting information, any type of smiling leads to higher arousal ratings, but only the visual channel increases perceived valence. When engineers are constrained in their design choices, we therefore suggest giving precedence to conveying the artificial agent's emotional state through the visual channel.
