Abstract

Most phonological contrasts are signaled by multiple acoustic cues, yet it is unclear how these cues are combined during speech perception. Formal computational modeling offers a useful tool for studying this process. Two computational approaches are presented here. The first is a mixture of Gaussians (MOG) model that forms categories and combines cues based on their statistical distributions [Toscano and McMurray, Proceedings of the Cognitive Science Society (2008)]. The second is a neural network model that combines statistical learning and dynamic online processing [McMurray and Spivey, Proceedings of the Chicago Linguistic Society (1999)]. Both the MOG and the network use the statistical distributions of speech sounds to form categories. The MOG offers transparency in that its categories correspond directly to distributional statistics measured from phonetic data. However, it does not capture the online processing observed in behavioral experiments that suggest that the speech system makes preliminary commitments before all cues are available [McMurray, Clayards, Tanenhaus, and Aslin (submitted)]. The network offers an approach that may allow us to observe this processing. Thus, while the MOG may better clarify the relationship between acoustics and phonological categories, the network may better model the process of speech perception. [Work supported by NIH.]
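As a concrete illustration of the MOG approach described above, the sketch below fits a two-component, one-dimensional mixture of Gaussians to simulated tokens of a single acoustic cue using expectation-maximization, then categorizes a novel token by posterior probability. This is a minimal sketch of the general technique, not the cited model: the cue (voice onset time), the sample values, and all parameter settings are illustrative assumptions, not figures from Toscano and McMurray (2008).

```python
# Minimal 1-D mixture-of-Gaussians sketch: learn two phonetic categories
# from the distribution of a single cue (here, hypothetical VOT values)
# via expectation-maximization. All numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Simulated VOT tokens (ms): a short-lag cluster and a long-lag cluster.
vot = np.concatenate([rng.normal(5, 8, 500),     # e.g., /b/-like tokens
                      rng.normal(50, 12, 500)])  # e.g., /p/-like tokens

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Rough initial guesses for the two components.
pi = np.array([0.5, 0.5])        # mixing weights
mu = np.array([0.0, 60.0])       # means
sigma = np.array([10.0, 10.0])   # standard deviations

for _ in range(200):
    # E-step: posterior probability that each token belongs to each category.
    resp = pi * gaussian_pdf(vot[:, None], mu, sigma)
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate weights, means, and variances from the posteriors.
    n_k = resp.sum(axis=0)
    pi = n_k / len(vot)
    mu = (resp * vot[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((resp * (vot[:, None] - mu) ** 2).sum(axis=0) / n_k)

print("weights:", pi.round(2))
print("means (ms):", mu.round(1))
print("s.d. (ms):", sigma.round(1))

# Categorize a novel token by posterior probability over the learned
# categories (the cue-based categorization step).
x_new = 25.0
p = pi * gaussian_pdf(x_new, mu, sigma)
print("P(category | VOT = 25 ms):", (p / p.sum()).round(2))
```

Because the fitted means and variances are read directly off the cue distribution, the learned categories are transparent in exactly the sense noted above. What a batch fit like this does not capture is the incremental, online commitment to partial cue information, which is the behavior that motivates the network model.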
