Abstract

Decades of research on vowels support the conclusion that perception and production of language-specific vowel categories cannot be based on invariant targets that are represented directly in either the auditory domain or the articulatory (sensorimotor) domain. This raises a number of questions about how an infant can acquire the cognitive representations relevant for learning the vowels of the ambient language. Some models of the acquisition process assume a fixed auditory transform to normalize for talker vocal tract size (e.g., Callan et al., 2000), ignoring evidence that normalization must be culture-specific (e.g., Johnson, 2005). Others assume that learning can be based on statistical regularities solely within the auditory domain (e.g., Assmann and Nearey, 2008), ignoring evidence that articulatory experience also shapes vowel category learning (e.g., Kamen and Watson, 1991). This paper outlines an alternative approach that models cross-modal learning. The approach aligns graph structures, called “manifolds,” which organize sensory information in the auditory and articulatory domains. Graph alignment is guided by perceptual targets that are internalized in early infancy through social/vocal interaction with caregivers, so that vowel categories can be identified with the abstractions that mediate between the two domains in the alignment process.
