Abstract
Imitating speech requires transforming sensory targets into vocal tract motor output, yet little is known about the representational basis of this process in the human brain. Here, we address this question using real-time MR imaging (rtMRI) of the vocal tract and functional MRI (fMRI) of the brain in a speech imitation paradigm. Participants trained on imitating a native vowel and a similar nonnative vowel that required lip rounding. Later, participants imitated these vowels and an untrained vowel pair during separate fMRI and rtMRI runs. Univariate fMRI analyses revealed that regions including left inferior frontal gyrus were more active during sensorimotor transformation (ST) and production of nonnative vowels than of native vowels; further, ST for nonnative vowels activated somatomotor cortex bilaterally, compared with ST for native vowels. Using representational similarity analysis (RSA) test models constructed from participants’ vocal tract images and from stimulus formant distances, searchlight analyses of the fMRI data showed that either type of model could be represented in somatomotor, temporal, cerebellar, and hippocampal neural activation patterns during ST. We thus provide the first evidence of widespread and robust cortical and subcortical neural representation of vocal tract and/or formant parameters during prearticulatory ST.
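As a schematic illustration of the kind of model comparison that RSA involves, the sketch below builds a formant-based model representational dissimilarity matrix (RDM) and correlates it with a neural RDM, as a searchlight analysis would do at each brain location. This is a minimal sketch of the general technique under stated assumptions, not the study's actual pipeline; all formant values, vowel labels, and activation patterns in it are hypothetical placeholders.

# Minimal RSA sketch: formant-based model RDM vs. a (simulated) neural RDM.
# All numbers below are hypothetical, not the study's data.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

# Hypothetical first/second formant values (Hz) for four vowel stimuli.
formants = np.array([
    [300, 2300],   # e.g., a native unrounded vowel
    [320, 1900],   # e.g., a similar nonnative rounded vowel
    [500, 1700],   # e.g., a second native vowel
    [480, 1400],   # e.g., its nonnative counterpart
], dtype=float)

# Acoustic (formant-distance) model RDM: pairwise Euclidean distances.
acoustic_rdm = squareform(pdist(formants, metric="euclidean"))

# Simulated multivoxel patterns (stimuli x voxels) that one searchlight
# sphere would extract; here just random data standing in for BOLD.
rng = np.random.default_rng(0)
patterns = rng.standard_normal((4, 50))

# Neural RDM: 1 minus the Pearson correlation between stimulus patterns.
neural_rdm = 1 - np.corrcoef(patterns)

# Model-neural similarity: Spearman correlation of the RDMs' lower
# triangles, the standard RSA comparison statistic at each searchlight.
tri = np.tril_indices_from(acoustic_rdm, k=-1)
rho, p = spearmanr(acoustic_rdm[tri], neural_rdm[tri])
print(f"model-neural RSA correlation: rho = {rho:.3f}")

A vocal tract image-based model RDM would be built the same way, substituting pairwise distances between image-derived articulatory measurements for the formant distances.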
Highlights
Speech imitation is a complex and multistage process that requires the interaction of both sensory and motor systems, such that acoustic inputs can be processed, transformed to target motor outputs, and articulated as speech
While the primary focus of our representational similarity analysis (RSA) was on sensorimotor transformation (ST) and speech imitation, we considered whether representation of the acoustically based and vocal tract image-based representational dissimilarity matrices (RDMs) would emerge during passive listening
Our results shed light on the extensive functional brain networks involved in preparing to articulate imitations of vowels that varied in familiarity; these results unveil the topography of regions involved in ST for vowel categories differing in their articulatory and acoustic properties, over and above results obtained using more traditional univariate blood oxygen level-dependent (BOLD) analyses
Summary
Speech imitation is a complex and multistage process that requires the interaction of both sensory and motor systems, such that acoustic inputs can be processed, transformed to target motor outputs, and articulated as speech (see Guenther 2006; Bohland et al 2010; Guenther and Vladusich 2012). Early accounts proposed that the perceptual components of this multistage process hinge on central speech representations at the subphonemic level, which would code for the motor effectors necessary for speech articulation during initial perception of the speech signal (Liberman et al 1967). The predictions of this motor theory of speech remain debated: while some have argued for a central role of posterior Sylvian regions (Sylvian parietal-temporal, Spt) in transforming sensory representations into motor output (Hickok and Buchsbaum 2003; Hickok and Poeppel 2007; Hickok et al 2009; Hickok 2012), others have failed to support these claims (Parker-Jones et al 2014) or have suggested the involvement of more widespread sensory and motor regions (Cogan et al 2014; Simmonds et al 2014b).