Abstract

The extent to which articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of debate for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the “dorsal stream”) to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and on acoustically matched, inarticulable nonspeech analogues. Once participants reached comparable levels of proficiency with the two sets of stimuli, fMRI activation was measured as they passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a set of left- and right-hemisphere regions outside the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, whereas no regions could successfully decode the inarticulable nonspeech contrast. Although activation patterns in the left inferior frontal gyrus, the middle temporal gyrus, and the supplementary motor area provided better information for decoding the articulable (speech) sounds than the inarticulable (sine-wave) sounds, the finding that dorsal stream regions do not emerge as good decoders of the articulable contrast alone suggests that other factors, including the strength and structure of the emerging speech categories, are more likely drivers of dorsal stream activation during novel sound learning.
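
To make the decoding analysis concrete, below is a minimal sketch of the kind of cross-validated multivoxel pattern classification described above, written in Python with scikit-learn. The data layout, region of interest, classifier choice, and all variable names are illustrative assumptions, not the study's actual pipeline.

    # Hypothetical sketch: classify sound-category labels from per-trial
    # voxel activation patterns in one region of interest (ROI).
    # Data here are random placeholders, not real fMRI estimates.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

    rng = np.random.default_rng(0)

    n_trials, n_voxels, n_runs = 120, 500, 6
    X = rng.standard_normal((n_trials, n_voxels))   # (trials x voxels) ROI patterns
    y = rng.integers(0, 2, n_trials)                # category label per trial (A=0, B=1)
    runs = np.repeat(np.arange(n_runs), n_trials // n_runs)  # scan-run index per trial

    # A linear classifier is a common choice for fMRI decoding; accuracy
    # reliably above chance (0.5 for two categories) is the usual criterion
    # for saying an ROI carries category information.
    clf = SVC(kernel="linear")
    scores = cross_val_score(clf, X, y, groups=runs, cv=LeaveOneGroupOut())
    print(f"mean decoding accuracy: {scores.mean():.3f}")

Leave-one-run-out cross-validation keeps training and test trials from the same scan run separate, a standard guard against temporal dependencies within a run inflating fMRI decoding accuracy.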

Highlights

  • Whether an infant learning her first language or an adult learning his fifth, the language learner must learn to perceive as well as produce new speech sounds.

  • Previous work has observed that the thalamus is sensitive to human speech sounds: Dehaene-Lambertz et al. (2005) found that the thalamus was generally more active for speech than for sine-wave speech analogues in humans, while Kraus et al. (1994) found that the guinea pig thalamus is sensitive to complex spectral differences between human speech sounds even when the animals had received no training on those sounds. More relevant to the goal of this study is the relationship between the thalamus and articulatory information: neuropsychological investigations have found that damage to the thalamus often yields difficulties with the articulation of speech sounds (Jonas, 1982; Wallesch et al., 1983), and our results suggest that during the learning of new speech sounds, the thalamus may represent the articulatory codes that will later be used for production, similar to the role of the dorsal stream in the dual stream model.

Introduction

Whether an infant learning her first language or an adult learning his fifth, the language learner must learn to perceive as well as produce new speech sounds. Models of speech perception like the motor theory of speech (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967) and subsequent direct realist approaches (Best, 1995; Fowler, 1986) make explicit predictions about the formative role of articulatory codes (i.e., the gestures made by the articulators to create a speech sound) in speech perception. In these models, motor or articulatory representations are the objects of perception, and must be acquired and accessed to achieve robust comprehension (see Galantucci, Fowler, & Turvey, 2006 for a review). Learning to discriminate between a pair of speech sounds like /b/ and /d/ (which are similar acoustically) requires that the listener access information about the articulatory gestures used to produce those sounds.
