Abstract

Speech perception is influenced by vision through a process of audiovisual integration. This is demonstrated by the McGurk illusion, where visual speech (for example /ga/) dubbed with incongruent auditory speech (such as /ba/) leads to a modified auditory percept (/da/). Recent studies have indicated that perception of the incongruent speech stimuli used in McGurk paradigms involves mechanisms of both general and audiovisual-speech-specific mismatch processing, and that general mismatch processing modulates induced theta-band (4–8 Hz) oscillations. Here, we investigated whether the theta modulation merely reflects mismatch processing or, alternatively, audiovisual integration of speech. We used electroencephalographic recordings from two previously published studies using audiovisual sine-wave speech (SWS), a spectrally degraded speech signal that sounds nonsensical to naïve perceivers but is perceived as speech by informed subjects. Earlier studies have shown that informed, but not naïve, subjects integrate SWS phonetically with visual speech. In an N1/P2 event-related potential paradigm, we found a significant difference in theta-band activity between informed and naïve perceivers of audiovisual speech, suggesting that audiovisual integration modulates induced theta-band oscillations. In a McGurk mismatch negativity (MMN) paradigm, in which infrequent McGurk stimuli were embedded in a sequence of frequent audiovisually congruent stimuli, we found no difference between congruent and McGurk stimuli. The infrequent stimuli in this paradigm violate both the general prediction of stimulus content and that of audiovisual congruence. Hence, we found no support for the hypothesis that audiovisual mismatch modulates induced theta-band oscillations. We also did not find any effects of audiovisual integration in the MMN paradigm, possibly due to the experimental design.

Highlights

  • Speech is perceived with both audition and vision

  • The integration hypothesis concerns audiovisual integration of phonetic features, which occurs in successful McGurk fusions [30] and for congruent audiovisual stimuli, but only when sine-wave speech (SWS) is perceived as speech

  • For the audiovisual congruent (AVC) stimuli, the difference between speech mode (SM) and nonspeech mode (NSM) was significant (p = 0.0200; see Fig 1), but no significant differences were observed for the auditory (A), visual (V), and audiovisual incongruent (AVI) conditions


Introduction

Seeing the face of the speaker improves comprehension if the auditory signal is weak or degraded [1] and speeds up the neural processing of speech [2]. The McGurk effect, in which dubbing an auditory syllable onto an incongruent speech video leads to a modified auditory percept (auditory /ba/ and visual /ga/ leading to the perception of /da/), is a striking behavioural demonstration of audiovisual (AV) integration in speech perception [3]. More recently, it has been argued that the perceptual fusion of incongruent audiovisual stimuli differs from that of congruent, naturally occurring speech, as it requires incongruence processing in addition to the mechanism of audiovisual integration [9,10].

