Abstract

Human speech can be comprehended using only auditory information from the talker's voice. However, comprehension improves if the talker's face is visible, especially when the auditory information is degraded, as occurs in noisy environments or with hearing loss. We explored the neural substrates of audiovisual speech perception using electrocorticography, the direct recording of neural activity from electrodes implanted on the cortical surface. We observed a double dissociation within the superior temporal gyrus (STG), a region long known to be important for speech perception, between the responses to audiovisual speech with clear and with noisy auditory components. Anterior STG showed greater neural activity to audiovisual speech with a clear auditory component, whereas posterior STG showed similar or greater neural activity to audiovisual speech in which the auditory speech was replaced with speech-like noise. A distinct border between the two response patterns was observed, demarcated by a landmark corresponding to the posterior margin of Heschl's gyrus. To further investigate the computational roles of the two regions, we considered Bayesian models of multisensory integration, which predict that combining the independent sources of information available from different modalities should reduce variability in the neural responses. We tested this prediction by measuring the variability of the neural responses to single audiovisual words. Posterior STG showed smaller variability than anterior STG during presentation of audiovisual speech with a noisy auditory component. Taken together, these results suggest that posterior STG, but not anterior STG, is important for multisensory integration of noisy auditory and visual speech.

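The variance-reduction prediction invoked above comes from standard maximum-likelihood (Bayesian-optimal) cue combination. The equations below are a textbook sketch of that model, not formulas taken from this paper: if the auditory and visual cues provide independent estimates of the same signal with variances \sigma_A^2 and \sigma_V^2, the optimal combined estimate weights each cue by its reliability, and its variance is never larger than that of the better single cue.

    \hat{s}_{AV} = w_A \hat{s}_A + w_V \hat{s}_V, \qquad
    w_A = \frac{\sigma_V^2}{\sigma_A^2 + \sigma_V^2}, \quad
    w_V = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_V^2}

    \sigma_{AV}^2 = \frac{\sigma_A^2 \, \sigma_V^2}{\sigma_A^2 + \sigma_V^2} \leq \min\left(\sigma_A^2, \sigma_V^2\right)

For example, a noisy auditory cue with \sigma_A^2 = 4 combined with a visual cue with \sigma_V^2 = 1 gives \sigma_{AV}^2 = 4/5 = 0.8, smaller than either cue alone; this is the sense in which integration is expected to reduce trial-to-trial variability in a region that combines the two cues.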
Introduction

Human speech perception is multisensory, combining auditory information from the talker's voice with visual information from the talker's face. The most popular technique for measuring human brain activity, BOLD fMRI, is an indirect measure of neural activity with a temporal resolution on the order of seconds, making it difficult to accurately measure the rapidly changing neural responses to speech. To overcome this limitation, we recorded directly from the brains of participants implanted with electrodes for the treatment of epilepsy. We measured activity from electrodes implanted over the superior temporal gyrus (STG), a key brain area for speech perception (Mesgarani, Cheung, Johnson, & Chang, 2014; Binder et al., 2000), as participants were presented with audiovisual speech with either clear or noisy auditory or visual components.
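For readers who want a concrete picture of the variability comparison described in the Abstract, the Python sketch below compares trial-to-trial response variability between two groups of electrodes. The array shapes, the simulated response amplitudes, and the use of the coefficient of variation are illustrative assumptions made for this sketch, not the authors' recorded data or analysis pipeline.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated response amplitudes (trials x electrodes) to repeated
    # presentations of a single audiovisual word with a noisy auditory
    # component; values and scales are arbitrary stand-ins chosen only
    # so that the two electrode groups differ.
    anterior_stg = rng.normal(loc=1.0, scale=0.6, size=(30, 16))
    posterior_stg = rng.normal(loc=1.0, scale=0.3, size=(30, 11))

    def trial_variability(responses):
        """Coefficient of variation across trials, averaged over electrodes."""
        cv = responses.std(axis=0) / np.abs(responses.mean(axis=0))
        return float(cv.mean())

    print("anterior STG variability :", round(trial_variability(anterior_stg), 3))
    print("posterior STG variability:", round(trial_variability(posterior_stg), 3))

Under the Bayesian account sketched after the Abstract, a region that integrates the visual cue with the noisy auditory cue should show the smaller of the two values, which is the pattern reported here for posterior STG.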
