Abstract
Speech is a complex and ambiguous acoustic signal that varies significantly within and across speakers. Despite the processing challenge that such variability poses, humans adapt to systematic variations in pronunciation rapidly. The goal of this study is to uncover the neurobiological bases of the attunement process that enables such fluent comprehension. Twenty-four native English-speaking participants listened to words spoken by a “canonical” American speaker and two non-canonical speakers, and performed a word-picture matching task, while magnetoencephalography was recorded. Non-canonical speech was created by including systematic phonological substitutions within the word (e.g. [s] → [sh]). Activity in the auditory cortex (superior temporal gyrus) was greater in response to substituted phonemes, and, critically, this was not attenuated by exposure. By contrast, prefrontal regions showed an interaction between the presence of a substitution and the amount of exposure: activity decreased for canonical speech over time, whereas responses to non-canonical speech remained consistently elevated. Granger causality analyses further revealed that prefrontal responses serve to modulate activity in auditory regions, suggesting the recruitment of top-down processing to decode non-canonical pronunciations. In sum, our results suggest that the behavioural deficit in processing mispronounced phonemes may be due to a disruption to the typical exchange of information between the prefrontal and auditory cortices as observed for canonical speech.
Highlights
Speech is a complex and ambiguous acoustic signal that varies significantly within and across speakers
In the left auditory cortex, we found a main effect of condition, such that attested and unattested substitutions consistently elicited more activity than canonical pronunciations between 188 and 600 ms
We test whether adaptation to non-canonical pronunciations occurs through the recalibration of early auditory processing, or whether it is supported by a higher-order repair mechanism
Summary
Speech is a complex and ambiguous acoustic signal that varies significantly within and across speakers. One prevalent example of such need for accommodation occurs when listening to an accented talker [11], and involves adjusting the mapping from acoustic input to phonemic categories through perceptual learning ([10,12–16]; see [17,18] for a review). This adjustment can happen quickly: the lower bound is estimated to be approximately 10 sentences for non-native speech [19,20], and within 30 sentences for noise-vocoded speech [21]. While adaptation to non-canonical speech has been robustly reported behaviorally, to date, the majority of work examining the neural underpinnings of this process has focused on distorted or degraded speech, rather than systematic phonetic variation (e.g. [21–23]). These two types of manipulations tap into fundamentally different phenomena: a signal-to-noise problem in the degraded case, and a mapping problem in the variation case.