Abstract

The ecology of human language is face-to-face interaction, comprising cues such as prosody, co-speech gestures and mouth movements. Yet this multimodal context is usually stripped away in experiments, as dominant paradigms focus on linguistic processing only. In two studies, we presented participants with video clips of an actress producing naturalistic passages while recording their electroencephalogram. We quantified the multimodal cues (prosody, gestures, mouth movements) and measured their effect on a well-established electroencephalographic marker of processing load in comprehension (the N400). We found that brain responses to words were affected by the informativeness of co-occurring multimodal cues, indicating that comprehension relies on both linguistic and non-linguistic cues. Brain responses were also affected by interactions between the multimodal cues, indicating that the impact of each cue dynamically changes based on the informativeness of the other cues. These results show that multimodal cues are integral to comprehension; our theories must therefore move beyond the limited focus on speech and linguistic processing.

Highlights

  • Words accompanied by a meaningful gesture elicited a significantly less negative N400, and the reduction in N400 amplitude afforded by meaningful gestures was larger for high surprisal words than for low surprisal words

  • We found a significant negative main effect of beat gestures: words accompanied by beat gestures elicited a more negative N400

Introduction

Language originated [1,2], is learnt [3,4,5] and is often used [6,7,8] in face-to-face contexts, where comprehension takes advantage of both audition and vision. In face-to-face contexts, linguistic information is accompanied by multimodal ‘non-linguistic’ cues such as speech intonation (prosody), hand gestures and mouth movements. We need to understand to what extent the processing of multimodal cues is central to natural language processing (e.g. whether a cue is used only when the linguistic information is ambiguous, or in experimental tasks that force attention to it). Answering this question is necessary to properly frame theories because, if some multimodal cues (e.g. gesture or prosody) always contribute to processing, then our current focus mainly on linguistic information is too narrow, if not misleading. Previous studies indicate that, at least when taken individually, multimodal cues interact with linguistic information in modulating the predictability of upcoming words.
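
For reference, the surprisal measure invoked in the highlights above is standardly defined, in information-theoretic terms, as the negative log-probability of a word given its preceding context; the exact operationalization in this study (e.g. which language model is used to estimate the probabilities) is not specified here and this definition is given only as the conventional one:

$$\mathrm{surprisal}(w_t) = -\log P(w_t \mid w_1, \dots, w_{t-1})$$

Under this definition, highly predictable words carry low surprisal and typically elicit smaller N400 amplitudes, which is why the gesture-driven reduction of the N400 reported above is most informative for high surprisal words.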
