Abstract
During speech perception, listeners rely on multimodal input and make use of both auditory and visual information. When speech, for example syllables, is presented, the differences in brain responses to distinct stimuli are not, however, caused merely by the acoustic or visual features of the stimuli. The congruency of the auditory and visual information and the familiarity of a syllable, that is, whether it appears in the listener’s native language or not, also modulate brain responses. We investigated how the congruency and familiarity of the presented stimuli affect brain responses to audio-visual (AV) speech in 12 adult Finnish native speakers and 12 adult Chinese native speakers. During a magnetoencephalography (MEG) measurement, they watched videos of a Chinese speaker pronouncing syllables (/pa/, /pʰa/, /ta/, /tʰa/, /fa/); of these, only /pa/ and /ta/ are part of Finnish phonology, whereas all five are part of Chinese phonology. The stimuli were presented in audio-visual (congruent or incongruent), audio-only, or visual-only conditions. The brain responses were examined in five time windows: 75–125, 150–200, 200–300, 300–400, and 400–600 ms. We found significant differences for the congruency comparison in the fourth time window (300–400 ms) in both sensor- and source-level analyses. Larger responses were observed for the incongruent stimuli than for the congruent stimuli. For the familiarity comparisons, no significant differences were found. The results are in line with earlier studies reporting modulation of brain responses by audio-visual congruency around 250–500 ms. This suggests a much stronger process for the general detection of a mismatch between predictions based on lip movements and the auditory signal than for the top-down modulation of brain responses based on phonological information.
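As an illustration of the time-window analysis summarized above, the sketch below averages evoked responses within each of the five windows and contrasts the congruent and incongruent conditions. The array shapes, sampling rate, variable names, and the simple paired t-test are assumptions made for the example; they are not the statistical pipeline used in the study.

```python
import numpy as np
from scipy.stats import ttest_rel

# Assumed data layout: averaged evoked responses per subject and condition,
# shape (n_subjects, n_channels, n_times), sampled at 1000 Hz from
# -100 ms to 600 ms relative to stimulus onset (hypothetical values).
sfreq = 1000.0
times = np.arange(-0.1, 0.6, 1.0 / sfreq)                  # in seconds
windows_ms = [(75, 125), (150, 200), (200, 300), (300, 400), (400, 600)]

def window_mean(evoked, start_ms, end_ms):
    """Mean amplitude over channels and samples within one time window."""
    mask = (times >= start_ms / 1000.0) & (times < end_ms / 1000.0)
    return evoked[:, :, mask].mean(axis=(1, 2))             # one value per subject

def compare_congruency(congruent, incongruent):
    """Paired contrast of incongruent vs. congruent responses per window."""
    results = {}
    for start, end in windows_ms:
        a = window_mean(congruent, start, end)
        b = window_mean(incongruent, start, end)
        t_val, p_val = ttest_rel(b, a)                       # incongruent minus congruent
        results[(start, end)] = {"t": t_val, "p": p_val,
                                 "mean_diff": float((b - a).mean())}
    return results

# Hypothetical data: 12 subjects, 204 gradiometer channels.
rng = np.random.default_rng(0)
congruent = rng.normal(size=(12, 204, times.size))
incongruent = rng.normal(size=(12, 204, times.size))
print(compare_congruency(congruent, incongruent))
```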
Highlights
In most cases speech perception relies on the seamless interaction and integration of auditory and visual information
Using magnetoencephalography (MEG), we studied how the effects of congruency and familiarity of the auditory and visual features are reflected in brain activity
Averaged planar gradiometer data were transformed into combined planar gradients using the vector sum of the two orthogonal sensors at each position, as implemented in the FieldTrip toolbox (Oostenveld et al., 2011); the combined gradients were used for the sensor-level analysis
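The vector-sum combination of orthogonal planar gradiometers amounts to taking the root of the sum of squares of the two gradients at each sensor position. The sketch below is a minimal numpy illustration of that operation, not FieldTrip's ft_combineplanar implementation; the channel counts and variable names are assumptions.

```python
import numpy as np

def combine_planar(grad_pairs):
    """Combine orthogonal planar gradiometer pairs by vector sum.

    grad_pairs: array of shape (n_positions, 2, n_times) holding the two
    orthogonal gradients measured at each sensor position.
    Returns shape (n_positions, n_times): the root-sum-of-squares amplitude
    at each sensor position.
    """
    return np.sqrt(np.sum(grad_pairs ** 2, axis=1))

# Hypothetical averaged data: 102 sensor positions, 2 orthogonal gradiometers,
# 700 time samples.
rng = np.random.default_rng(0)
averaged = rng.normal(size=(102, 2, 700))
combined = combine_planar(averaged)   # (102, 700), input to sensor-level analysis
```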
Summary
In most cases speech perception relies on the seamless interaction and integration of auditory and visual information. Auditory and visual cues can be presented either congruently or incongruently, and this match or mismatch of features can be used to study the audio-visual processing of speech. Using magnetoencephalography (MEG), we studied how the effects of congruency and familiarity (i.e., whether the speech stimuli are part of the listener’s phonology or not) of the auditory and visual features are reflected in brain activity. The inferior parietal cortex has been shown to be activated at around 200 ms, an activation suggested to reflect the connection from the superior temporal sulcus (STS) to the inferior frontal lobe (Broca’s area) (Nishitani and Hari, 2002), with stronger activation in the left hemisphere than in the right (Capek et al., 2004; Campbell, 2008). This is followed by activation in frontal areas close to Broca’s area at around 250 ms (Nishitani and Hari, 2002).