Abstract

The neural substrates by which speech sounds are perceptually segregated into distinct streams are poorly understood. Here, we recorded high-density scalp event-related potentials (ERPs) while participants were presented with a cyclic pattern of three vowel sounds (/ee/-/ae/-/ee/). Each trial consisted of an adaptation sequence, which had either a small, intermediate, or large difference in first formant frequency (Δf1) between vowels, followed by a test sequence in which Δf1 was always intermediate. During the adaptation sequence, participants were more likely to hear two streams (“streaming”) when Δf1 was intermediate or large than when it was small. During the test sequence, where Δf1 was always intermediate, the pattern tended to reverse: participants were more likely to hear a single stream as the Δf1 of the preceding adaptation sequence increased. During the adaptation sequence, Δf1-related brain activity was observed between 100 and 250 ms after the /ae/ vowel over fronto-central and left temporal areas, consistent with generation in auditory cortex. During the test sequence, the prior stimulus context modulated ERP amplitude between 20 and 150 ms over the left fronto-central scalp region. Our results demonstrate that the proximity of formants between adjacent vowels is an important factor in the perceptual organization of speech, and they reveal a widely distributed neural network supporting perceptual grouping of speech sounds.
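
As a rough sketch of the trial structure described above, the snippet below assembles an adaptation sequence of repeating /ee/-/ae/-/ee/ triplets with a condition-specific Δf1, followed by a test sequence in which Δf1 is always intermediate. All numeric values (formant frequencies, numbers of triplet repetitions) are placeholders for illustration, not the stimulus parameters used in the study.

```python
# Illustrative sketch of the trial structure (hypothetical parameter values).
# Each triplet is /ee/-/ae/-/ee/; only the /ae/ vowel's first formant (f1)
# is shifted relative to /ee/ by the condition-specific delta_f1.

EE_F1 = 400.0                      # placeholder f1 of /ee/ in Hz
DELTA_F1 = {"small": 100.0,        # placeholder Δf1 values in Hz
            "intermediate": 300.0,
            "large": 600.0}

def triplet(delta_f1):
    """Return the f1 values of one /ee/-/ae/-/ee/ triplet."""
    return [("ee", EE_F1), ("ae", EE_F1 + delta_f1), ("ee", EE_F1)]

def build_trial(adaptation_condition, n_adapt=10, n_test=5):
    """One trial: adaptation sequence (condition-specific Δf1) followed by a
    test sequence (always the intermediate Δf1)."""
    adaptation = [triplet(DELTA_F1[adaptation_condition]) for _ in range(n_adapt)]
    test = [triplet(DELTA_F1["intermediate"]) for _ in range(n_test)]
    return {"adaptation": adaptation, "test": test}

trial = build_trial("large")
print(trial["adaptation"][0])  # [('ee', 400.0), ('ae', 1000.0), ('ee', 400.0)]
print(trial["test"][0])        # [('ee', 400.0), ('ae', 700.0), ('ee', 400.0)]
```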

Highlights

  • Animal studies and neuroimaging research in humans suggest that auditory stream segregation involves a widely distributed neural network that comprises brainstem, midbrain, primary and secondary auditory cortices as well as the inferior parietal lobule (IPL)[9,10,11,12,13,14,15,16,17]

  • While the perceptual organization of speech sounds likely involves brain areas similar to those described for pure tones, one may posit that the perceptual grouping of speech would engage more left-lateralized brain areas than those typically involved in grouping pure tone stimuli

  • There was a difference in perception at test based on which Δf1 was presented at adaptation; participants were significantly less likely to report hearing two streams with increasing Δf1 in the adaptation sequences (F(2,30) = 20.362, p < 0.001, ηp² = 0.576, all pairwise comparisons p < 0.05; linear trend: F(1,15) = 26.685, p < 0.001, ηp² = 0.640)
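
The partial eta squared values reported in the highlight above follow directly from the F statistics and their degrees of freedom via ηp² = (F × df_effect) / (F × df_effect + df_error). The short check below reproduces those effect sizes from the reported numbers; it assumes only this standard relationship and is not code from the study.

```python
def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (F * df_effect) / (F * df_effect + df_error)

# Main effect of adaptation Δf1 on streaming reports at test: F(2,30) = 20.362
print(f"{partial_eta_squared(20.362, 2, 30):.3f}")  # 0.576

# Linear trend: F(1,15) = 26.685
print(f"{partial_eta_squared(26.685, 1, 15):.3f}")  # 0.640
```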


Introduction

Animal studies and neuroimaging research in humans suggest that auditory stream segregation involves a widely distributed neural network comprising the brainstem, midbrain, primary and secondary auditory cortices, as well as the inferior parietal lobule (IPL)[9,10,11,12,13,14,15,16,17]. Scalp recordings of event-related potentials (ERPs) have revealed an increase in the sensory evoked response as a function of frequency separation, occurring at about 100–300 ms after sound onset over the frontocentral scalp region and right temporal areas[23]. These ERP modulations appear to index a relatively automatic process, as they are present even when participants are not actively attending to the stimuli[23]. Studies using fMRI have observed stimulus-driven effects[25] as well as perception-related changes in IPL activity[12] when participants reported hearing one versus two streams. Together, these studies suggest that the perception of concurrent sound streams is associated with activity in auditory cortices and inferior parietal cortex. As in Snyder et al.[27], we predicted that neural correlates reflecting the processing of the Δf1 separation between adjacent vowels would differ from those related to the perception of concurrent streams of speech sounds.
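
As a schematic of how such window-based ERP effects are commonly quantified, the sketch below averages single-trial epochs for each Δf1 condition and takes the mean amplitude in a 100–300 ms post-onset window at one channel. The sampling rate, epoch dimensions, and data are hypothetical placeholders, not values from this study or from reference [23].

```python
import numpy as np

FS = 500                 # hypothetical sampling rate in Hz
T_MIN = -0.1             # epoch start relative to sound onset, in s
WINDOW = (0.100, 0.300)  # analysis window after sound onset, in s

def window_mean_amplitude(epochs, window=WINDOW, fs=FS, tmin=T_MIN):
    """Mean amplitude of the trial-averaged waveform in a post-onset window.

    epochs: array of shape (n_trials, n_samples) for a single channel,
            time-locked to sound onset with the first sample at `tmin`.
    """
    evoked = epochs.mean(axis=0)                  # average across trials
    start = int(round((window[0] - tmin) * fs))   # convert window to samples
    stop = int(round((window[1] - tmin) * fs))
    return evoked[start:stop].mean()

# Hypothetical single-channel epochs for two Δf1 conditions (random data here).
rng = np.random.default_rng(0)
small_df1 = rng.normal(size=(80, 300))   # 80 trials, 300 samples (-0.1 to 0.5 s)
large_df1 = rng.normal(size=(80, 300))

# With real data, a larger window mean for the large-Δf1 condition would parallel
# the frequency-separation effect described above; here the inputs are random
# noise, so the difference is arbitrary.
print(window_mean_amplitude(large_df1) - window_mean_amplitude(small_df1))
```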

