Abstract
This work describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on contextual variability, motor equivalence, coarticulation, and speaking rate effects. Model parameters are learned during a babbling phase. To explain how infants learn phoneme-specific and language-specific limits on acceptable articulatory variability, the learned speech sound targets take the form of regions, or convex hulls, in orosensory coordinates. This leads to an explanation of coarticulation wherein the target for a speech sound is reduced in size based on context to provide a more efficient sequence of articulator movements. Furthermore, reduction of target size for better accuracy during slower speech (in accordance with Fitts's law) leads to differential effects for vowels and consonants, as seen in speaking rate experiments that were previously explained by positing separate control processes for the two sound classes. The babbling process also naturally accounts for the formation of coordinative structures, or groups of articulator movements marshalled together to perform orosensory tasks. Coordinative structures provide motor equivalence, including automatic compensation for perturbations or constraints on the articulators. Computer simulations verify the model's motor equivalence, coarticulation, and speaking rate properties. [Work partially supported by AFOSR F49620-92-J-0499.]
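The idea of a target region that shrinks for slower speech can be illustrated with a toy sketch. This is not the model's implementation: it assumes a hypothetical two-dimensional orosensory space, represents the convex target as an axis-aligned box, and uses made-up names (`nearest_point_in_box`, `shrink_box`) and numbers. It only shows that a movement aimed at the nearest point of a region ends in a different place once the region is reduced in size.

```python
# Illustrative sketch only. The box-shaped target, the 2-D space, and all
# values below are assumptions, not taken from the model described above.

def nearest_point_in_box(position, lower, upper):
    """Clamp each coordinate into [lower, upper]; the result is the closest point of the box."""
    return [min(max(p, lo), hi) for p, lo, hi in zip(position, lower, upper)]

def shrink_box(lower, upper, factor):
    """Shrink the box about its center; factor=1 keeps it, factor=0 collapses it to a point."""
    centers = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    halves = [(hi - lo) / 2 * factor for lo, hi in zip(lower, upper)]
    new_lower = [c - h for c, h in zip(centers, halves)]
    new_upper = [c + h for c, h in zip(centers, halves)]
    return new_lower, new_upper

# Hypothetical target region and current articulator state.
lower, upper = [0.0, 2.0], [4.0, 6.0]
current = [5.0, 3.0]

# Full-size region: the movement stops at the region's edge.
goal = nearest_point_in_box(current, lower, upper)

# Region halved (standing in for slower, more precise speech): the same
# starting state now has to travel further toward the region's center.
slow_lower, slow_upper = shrink_box(lower, upper, 0.5)
slow_goal = nearest_point_in_box(current, slow_lower, slow_upper)
```

With the full region, the movement from `current` only needs to reach the edge at `[4.0, 3.0]`. With the halved region it must reach `[3.0, 3.0]`, which is a longer and more tightly constrained movement.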