Abstract

Human listeners can focus on one speech stream out of several concurrent ones. The present study aimed to assess the whole-brain functional networks underlying (a) the process of focusing attention on a single speech stream vs. dividing attention between two streams and (b) speech processing on different time-scales and depths. Two spoken narratives were presented simultaneously while listeners were instructed to (a) track and memorize the contents of a speech stream and (b) detect the presence of numerals or syntactic violations either in the same ("focused attention condition") or in the parallel stream ("divided attention condition"). Speech content tracking was found to be associated with stronger connectivity in lower frequency bands (delta band, 0.5–4 Hz), whereas the detection tasks were linked with networks operating in the faster alpha (8–10 Hz) and beta (13–30 Hz) bands. These results suggest that the oscillation frequencies of the dominant brain networks during speech processing may be related to the duration of the time window within which information is integrated. We also found that focusing attention on a single speaker, compared to dividing attention between two concurrent speakers, was predominantly associated with connections involving the frontal cortices in the delta (0.5–4 Hz), alpha (8–10 Hz), and beta (13–30 Hz) bands, whereas dividing attention between two parallel speech streams was linked with stronger connectivity involving the parietal cortices in the delta and beta frequency bands. Overall, connections strengthened by focused attention may reflect control over information selection, whereas connections strengthened by divided attention may reflect the need to maintain two streams in parallel and the related control processes necessary for performing the tasks.
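The abstract reports connectivity differences in specific frequency bands but does not specify the connectivity metric in this excerpt. The following is a minimal, illustrative sketch (not the authors' pipeline) of how band-limited functional connectivity between two sensor signals could be estimated using a phase-locking value; the band edges follow the abstract, while the sampling rate, filter design, and the choice of PLV as the metric are assumptions made for demonstration only.

```python
"""Illustrative sketch: band-limited functional connectivity via the
phase-locking value (PLV). Band edges follow the abstract (delta 0.5-4 Hz,
alpha 8-10 Hz, beta 13-30 Hz); sampling rate, filter design, and the PLV
metric itself are assumptions, not the authors' reported methods."""
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

BANDS = {"delta": (0.5, 4.0), "alpha": (8.0, 10.0), "beta": (13.0, 30.0)}

def band_plv(x, y, fs, band):
    """Phase-locking value between signals x and y within one frequency band."""
    low, high = BANDS[band]
    # Zero-phase band-pass filtering to isolate the band of interest
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    phase_x = np.angle(hilbert(filtfilt(b, a, x)))
    phase_y = np.angle(hilbert(filtfilt(b, a, y)))
    # PLV: magnitude of the mean phase-difference vector (0 = no locking, 1 = perfect)
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

if __name__ == "__main__":
    fs = 250.0                                   # assumed sampling rate (Hz)
    t = np.arange(0.0, 10.0, 1.0 / fs)
    rng = np.random.default_rng(0)
    # Two noisy 2 Hz (delta-band) signals with a fixed phase offset
    x = np.sin(2 * np.pi * 2 * t) + 0.5 * rng.standard_normal(t.size)
    y = np.sin(2 * np.pi * 2 * t + 0.3) + 0.5 * rng.standard_normal(t.size)
    for band in BANDS:
        print(band, round(band_plv(x, y, fs, band), 3))
```

In such an analysis, a connectivity value of this kind would typically be computed for every sensor (or source) pair and frequency band, yielding the band-specific networks that are then compared across attention conditions.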

Highlights

  • In everyday life we can reliably follow one speech stream in a noisy multi-talker environment [1]; understanding the brain’s machinery underlying this feat is one of the major challenges of auditory neuroscience

  • Listeners performed significantly better in the numeral detection than in the syntactic violation detection task, and in the focused than in the divided attention condition

  • Because RTs were calculated from word onset, the RT difference between the two tasks may be biased by different delays from word onset for recognizing numerals compared to detecting a syntactic violation


Introduction

In everyday life we can reliably follow one speech stream in a noisy multi-talker environment [1]; understanding the brain’s machinery underlying this feat is one of the major challenges of auditory neuroscience. This is because solving the problem involves complex functions, such as auditory scene analysis [2], speech processing on multiple timescales [3], and selective attention [4,5,6,7,8]. Oscillations in different frequency ranges may serve functions such as neural segmentation and identification of various speech units [15,18,21,33,34].

