Lip-Reading Furhat: Audio Visual Intelligibility of a Back Projected Animated Face

Samer Al Moubayed, Gabriel Skantze, Jonas Beskow

doi:10.1007/978-3-642-33197-8_20

Abstract

Back-projecting a computer-animated face onto a three-dimensional static physical model of a face is a promising technology that is gaining ground as a solution for building situated, flexible, and human-like robot heads. In this paper, we first briefly describe Furhat, a back-projected robot head built for multimodal multiparty human-machine interaction, and its benefits over virtual characters and robotic heads. We then motivate the need to investigate the contribution that Furhat's face makes to speech intelligibility. We present an audio-visual speech intelligibility experiment in which 10 subjects listened to short sentences with a degraded speech signal. The experiment compares the gain in intelligibility from lip-reading a face shown on a 2D screen with that from a 3D back-projected face, at different viewing angles. The results show that audio-visual speech intelligibility holds when the avatar is projected onto a static face model (as in Furhat) and, rather surprisingly, even exceeds that of the flat display. This means that, despite the movement limitations that back-projected animated face models bring about, their audio-visual speech intelligibility is equal to, or even higher than, that of the same models shown on flat displays. At the end of the paper, we discuss several hypotheses on how to interpret the results and motivate future investigations to better explore the characteristics of visual speech perception with 3D projected faces.
