Abstract

Our understanding of auditory-visual (AV) speech integration has greatly benefited from recent advances in neuroscience and multisensory research. AV speech integration raises numerous questions relevant to the computational rules needed for binding information (within and across sensory modalities), the representational format in which speech information is encoded in the brain (e.g., auditory vs. articulatory), and how AV speech ultimately interfaces with the linguistic system. The following non-exhaustive review provides a set of empirical findings and theoretical questions that have fed the original proposal for predictive coding in AV speech processing. More recently, predictive coding has pervaded many fields of inquiry and reinforced the need to refine the notion of internal models in the brain, together with their implications for the interpretation of neural activity recorded with various neuroimaging techniques. However, it is argued here that the strength of predictive coding frameworks resides in the specificity of the generative internal models, not in their generality; specifically, internal models come with a set of rules applied to particular representational formats, which themselves depend on the levels and the network structure at which predictive operations occur. As such, predictive coding in AV speech needs to specify the level(s) and the kinds of internal predictions that are necessary to account for the perceptual benefits or illusions observed in the field. Among these specifications, the actual content of a prediction comes first and foremost, followed by the representational granularity of that prediction in time. This review presents a focused discussion of these issues.

Highlights

  • In natural conversational settings, watching an interlocutor’s face does not solely provide information about the speaker’s identity or emotional state: the kinematics of the face articulating speech can robustly influence the processing and comprehension of auditory speech

  • Advances in multisensory research have raised core issues: how early does multisensory integration occur during perceptual processing (Talsma et al., 2010)? In which representational format do sensory modalities interface for supramodal analysis (Pascual-Leone and Hamilton, 2001; Voss and Zatorre, 2012) and speech analysis (Summerfield, 1987; Altieri et al., 2011)? Which neuroanatomical pathways are implicated (Calvert and Thesen, 2004; Ghazanfar and Schroeder, 2006; Driver and Noesselt, 2008; Murray and Spierer, 2011)? In humans, visual speech plays an important role in social interactions and, crucially, interfaces with the language system at various depths of linguistic processing (e.g., McGurk and MacDonald, 1976; Auer, 2002; Brancazio, 2004; Campbell, 2008)

  • This review focuses on the specificities of AV speech, not on the general guiding principles of multisensory (AV) integration

Introduction

In natural conversational settings, watching an interlocutor’s face does not solely provide information about the speaker’s identity or emotional state: the kinematics of the face articulating speech can robustly influence the processing and comprehension of auditory speech. The robustness and principled ways in which visual speech influences auditory speech processing suggest that the neural underpinnings of AV speech integration rely on specific computational mechanisms that are constrained by the internal rules of the speech processing system—and possibly modulated by attentional focus on one or the other stream of information.
