Abstract

Integration of multimodal sensory information is fundamental to many aspects of human behavior, but the neural mechanisms underlying these processes remain mysterious. For example, during face-to-face communication, we know that the brain integrates dynamic auditory and visual inputs, but we do not yet understand where and how such integration mechanisms support speech comprehension. Here, we quantify representational interactions between dynamic audio and visual speech signals and show that different brain regions exhibit different types of representational interaction. With a novel information theoretic measure, we found that theta (3–7 Hz) oscillations in the posterior superior temporal gyrus/sulcus (pSTG/S) represent auditory and visual inputs redundantly (i.e., represent common features of the two), whereas the same oscillations in left motor and inferior temporal cortex represent the inputs synergistically (i.e., the instantaneous relationship between audio and visual inputs is also represented). Importantly, redundant coding in the left pSTG/S and synergistic coding in the left motor cortex predict behavior—i.e., speech comprehension performance. Our findings therefore demonstrate that processes classically described as integration can have different statistical properties and may reflect distinct mechanisms that occur in different brain regions to support audiovisual speech comprehension.
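To make the distinction between redundant and synergistic coding concrete, the standard partial information decomposition (Williams and Beer) splits the joint mutual information that the auditory speech signal A and the visual speech signal V carry about a brain signal B into four non-negative components; the notation below is an illustrative sketch of that standard form and is not reproduced from the paper itself:

\[
I(A,V;B) = \mathrm{Red}(A,V;B) + \mathrm{Unq}(A;B \setminus V) + \mathrm{Unq}(V;B \setminus A) + \mathrm{Syn}(A,V;B)
\]

Here Red is the information about B shared by A and V, the Unq terms are the contributions available from one modality alone, and Syn is the information available only from observing A and V together.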

Highlights

  • While engaged in a conversation, we effortlessly integrate auditory and visual speech information into a unified perception

  • Combining different sources of information is fundamental to many aspects of behavior, from picking up a ringing mobile phone to communicating with a friend in a busy environment

  • We have studied the integration of auditory and visual speech information



Introduction

While engaged in a conversation, we effortlessly integrate auditory and visual speech information into a unified perception. The superior temporal gyrus/sulcus (STG/S) responds to the integration of auditory and visual stimuli, and its disruption leads to reduced McGurk fusion [10, 11, 12, 13, 14]. However, these previous studies have two shortcomings. First, their experimental designs typically contrasted two conditions: unisensory (i.e., audio or visual cues alone) and multisensory (congruent or incongruent audio and visual cues). Second, they typically investigated (changes of) regional activation rather than information integration between audiovisual stimuli and brain signals. Here, we used a novel methodology (speech-brain entrainment) and novel information theoretic measures (the partial information decomposition [PID] [15]) to quantify the interactions between audiovisual stimuli and dynamic brain signals.
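To illustrate how such representational interactions between stimuli and brain signals can be quantified, the sketch below estimates mutual information and the interaction information (co-information) for discretized audio, visual, and brain time series; under a PID, co-information equals redundancy minus synergy, so its sign indicates which interaction type dominates. This is a minimal numpy-only sketch with hypothetical helper functions (entropy, mutual_info, co_information) and toy data; it does not reproduce the specific PID estimator used in the paper.

```python
import numpy as np

def entropy(*signals, bins=4):
    """Plug-in joint entropy (in bits) of one or more continuous signals,
    each discretized into `bins` equiprobable levels."""
    edges = [np.quantile(s, np.linspace(0, 1, bins + 1)[1:-1]) for s in signals]
    digitized = [np.digitize(s, e) for s, e in zip(signals, edges)]
    joint = np.stack(digitized, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_info(x, y, bins=4):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x, bins=bins) + entropy(y, bins=bins) - entropy(x, y, bins=bins)

def co_information(a, v, b, bins=4):
    """Interaction information I(A;B) + I(V;B) - I(A,V;B).
    Under a PID this equals redundancy minus synergy:
    positive -> net redundant coding, negative -> net synergistic coding."""
    i_avb = entropy(a, v, bins=bins) + entropy(b, bins=bins) - entropy(a, v, b, bins=bins)
    return mutual_info(a, b, bins) + mutual_info(v, b, bins) - i_avb

# Toy data: a "brain" signal entrained to dynamics shared by the audio and visual streams
rng = np.random.default_rng(0)
audio = rng.standard_normal(5000)
visual = 0.8 * audio + 0.6 * rng.standard_normal(5000)  # correlated with the audio stream
brain = 0.7 * audio + 0.7 * rng.standard_normal(5000)   # tracks the shared component
print(f"co-information: {co_information(audio, visual, brain):.3f} bits")
```

Running this toy example prints a positive co-information, consistent with a brain signal that tracks features shared between the audio and visual streams; a negative value would instead indicate that the joint audiovisual relationship carries extra, synergistic information.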

