Abstract
The ability to efficiently facilitate social interaction and emotional expression is an important, yet unmet, requirement for speech generating devices aimed at individuals with speech impairment. Using gestures such as facial expressions to control aspects of expressive synthetic speech could improve the communication experience for both the user of the device and the conversation partner. For this purpose, a mapping model between facial expressions and speech is needed that is high-level (utterance-based), versatile and personalisable. In the mapping developed in this work, the visual and auditory modalities are connected on the basis of the intended emotional salience of a message: the intensity of the user's facial expressions is mapped to the emotional intensity of the synthetic speech. The mapping model has been implemented in a system called WinkTalk, which uses estimated facial expression categories and their intensity values to automatically select between three expressive synthetic voices reflecting three degrees of emotional intensity. The system was evaluated in an interactive experiment using simulated augmented conversations. The results show that automatic control of synthetic speech through facial expressions is fast, non-intrusive and sufficiently accurate, and that it helps the user feel more involved in the conversation. It can be concluded that the system has the potential to facilitate a more efficient communication process between user and listener.
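To make the utterance-level mapping concrete, the minimal sketch below illustrates one way estimated facial expression intensity could be mapped to one of three expressive voices. It is not the authors' implementation: the expression labels, the normalised intensity scale, and the threshold values are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: the abstract does not specify the implementation,
# so the label set, intensity scale and thresholds below are assumptions.

from dataclasses import dataclass

# Hypothetical tiers standing in for the three expressive synthetic voices
# reflecting three degrees of emotional intensity.
VOICES = ("calm", "moderate", "intense")


@dataclass
class FacialExpressionEstimate:
    category: str      # e.g. "smile", "frown" -- assumed label set
    intensity: float   # assumed to be normalised to the range [0.0, 1.0]


def select_voice(estimate: FacialExpressionEstimate,
                 low: float = 0.33, high: float = 0.66) -> str:
    """Map an estimated facial expression intensity to one of three voices.

    The thresholds are placeholders; a real system would calibrate them per
    user, in line with the 'personalisable' requirement of the mapping model.
    """
    if estimate.intensity < low:
        return VOICES[0]
    if estimate.intensity < high:
        return VOICES[1]
    return VOICES[2]


# Example: a strongly expressed smile selects the most emotionally intense voice.
print(select_voice(FacialExpressionEstimate(category="smile", intensity=0.8)))
# -> "intense"
```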