Abstract

How humans extract the identity of speech sounds from highly variable acoustic signals remains unclear. Here, we use searchlight representational similarity analysis (RSA) to localize and characterize neural representations of syllables at different levels of the hierarchically organized temporo-frontal pathways for speech perception. We asked participants to listen to spoken syllables that differed considerably in their surface acoustic form, by changing speaker and by degrading surface acoustics using noise-vocoding and sine-wave synthesis, while we recorded neural responses with functional magnetic resonance imaging. We found evidence for a graded hierarchy of abstraction across the brain. At the peak of the hierarchy, neural representations in somatomotor cortex encoded syllable identity but not surface acoustic form; at the base of the hierarchy, primary auditory cortex showed the reverse. In between, bilateral temporal cortex exhibited an intermediate response, encoding both syllable identity and the surface acoustic form of speech. Regions of somatomotor cortex associated with encoding syllable identity in perception were also engaged when producing the same syllables in a separate session. These findings are consistent with a hierarchical account of how variable acoustic signals are transformed into abstract representations of the identity of speech sounds.

Highlights

  • How do listeners perceive highly variable speech signals? No two naturally produced syllables are exactly alike since their precise acoustic realization varies both within and between speakers

  • We conducted a correlation analysis on beta values extracted from the peak of the inferior frontal and precentral gyrus cluster associated with degraded speech perception: activity within this cluster was correlated with individual differences in detection of repeated degraded syllables (r2 = 0.527, P = 0.001)

  • By comparing the degree to which multivoxel patterns in auditory, temporal, and somatomotor regions code for surface acoustic characteristics, we provide evidence that these regions sit at different levels of a processing hierarchy that maps the variable acoustic forms of speech signals to more abstract representations of syllable identity
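The comparison described above rests on representational similarity analysis: a neural representational dissimilarity matrix (RDM) computed from multivoxel patterns is correlated with model RDMs coding syllable identity or surface acoustic form. The following is a minimal sketch of that logic in Python using simulated data; the condition counts, voxel counts, and distance choices are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal RSA sketch with simulated data (all shapes and labels are
# illustrative assumptions, not the study's real design or pipeline).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Assume 16 conditions (4 syllables x 4 surface acoustic forms) and a
# searchlight containing 50 voxels.
n_conditions, n_voxels = 16, 50
patterns = rng.standard_normal((n_conditions, n_voxels))

# Neural RDM: pairwise correlation distance between condition patterns.
# pdist returns the condensed upper triangle (16*15/2 = 120 entries).
neural_rdm = pdist(patterns, metric="correlation")

# Model RDM for syllable identity: 0 if two conditions share a syllable,
# 1 otherwise, regardless of surface acoustic form.
syllable = np.repeat(np.arange(4), 4)  # condition -> syllable label
identity_rdm = pdist(syllable[:, None],
                     metric=lambda a, b: float(a[0] != b[0]))

# Rank-correlate the neural and model RDMs; a reliably positive rho in a
# searchlight would indicate that the region codes syllable identity.
rho, p = spearmanr(neural_rdm, identity_rdm)
print(f"rho={rho:.3f}, p={p:.3f}")
```

In a searchlight implementation, this correlation would be recomputed at every voxel neighborhood and the resulting rho map tested across participants; here only a single simulated searchlight is shown.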


Introduction

No two naturally produced syllables are exactly alike, since their precise acoustic realization varies both within and between speakers. Despite this variability, listeners are typically able to understand speech rapidly and accurately even when it has been significantly degraded (Remez et al., 1981; Shannon et al., 1995; Dupoux and Green, 1997; Brungart, 2001). These observations suggest that no single acoustic cue is necessary for correct perception of speech sounds. However, the nature of motor contributions to speech perception remains controversial (Lotto et al., 2009; Scott et al., 2009), and the neural mechanisms by which motor representations are accessed from speech remain underspecified.
