Abstract

How do listeners interpret speech input when exposure does not contain information about the talker’s category representations? Competing accounts of this process are rarely contrasted directly. We implement competing hypotheses within the same general computational framework (Bayesian inference). All models were trained on phonetic productions to predict perception, reflecting the hypothesis that listeners learn category representations from the speech input. We compare the models against two experiments on the perception of L1-US English stop voicing (N = 24 and 122). Both experiments used minimal pairs (e.g., tin/din) varying in the primary cue (voice onset time, VOT) while holding secondary cues (f0 and vowel duration) at values consistent with their expected correlations with VOT. VOT values occurred equally often and spanned the range observed in US English. We find that (1) models that integrated perceptual noise predicted categorization responses better than those that did not; (2) models with multiple cues performed better than those with VOT alone; (3) models trained on talker-normalized phonetic cues performed better than those trained on unnormalized cues; and, surprisingly, (4) models that also normalized the novel speech input during the experiment performed worse than those that did not. Findings (3) and (4) suggest that listeners’ long-term representations are based on talker-normalized cues but require *labelled* input, contrary to most normalization accounts.

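To make the modeling approach concrete, the following is a minimal sketch, not the authors’ implementation, of the kind of Bayesian ideal-observer categorization model the abstract contrasts: each voicing category is a Gaussian over phonetic cues estimated from production data, and perceptual noise is modeled by widening each category’s likelihood before computing the posterior. All cue statistics, noise variances, and the stimulus values below are hypothetical placeholders.

```python
# Minimal sketch of a Bayesian ideal-observer categorization model over
# phonetic cues (VOT, f0, vowel duration). Not the authors' implementation;
# all numeric values are hypothetical placeholders.
import numpy as np
from scipy.stats import multivariate_normal

def posterior_voiceless(x, mu_d, cov_d, mu_t, cov_t, noise_cov=None, prior_t=0.5):
    """P(/t/ | cues x) under Gaussian category likelihoods.

    If noise_cov is given, perceptual noise is modeled by adding it to each
    category's covariance (i.e., marginalizing over the noisy percept).
    """
    if noise_cov is not None:
        cov_d = cov_d + noise_cov
        cov_t = cov_t + noise_cov
    like_d = multivariate_normal.pdf(x, mean=mu_d, cov=cov_d)
    like_t = multivariate_normal.pdf(x, mean=mu_t, cov=cov_t)
    return prior_t * like_t / (prior_t * like_t + (1 - prior_t) * like_d)

# Hypothetical category statistics over [VOT (ms), f0 (Hz), vowel duration (ms)],
# standing in for means and covariances estimated from production data.
mu_d = np.array([10.0, 180.0, 200.0])   # voiced /d/
mu_t = np.array([70.0, 200.0, 180.0])   # voiceless /t/
cov = np.diag([15.0**2, 20.0**2, 25.0**2])
noise = np.diag([8.0**2, 10.0**2, 12.0**2])  # assumed perceptual noise

x = np.array([35.0, 190.0, 190.0])           # an ambiguous stimulus
print(posterior_voiceless(x, mu_d, cov, mu_t, cov, noise_cov=noise))
```

In this framing, the model variants described in the abstract correspond to restricting `x` to VOT alone versus using all three cues, including or omitting `noise_cov`, and estimating the category statistics from unnormalized versus talker-normalized cue values.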