Listening to different speakers: On the time-course of perceptual compensation for vocal-tract characteristics

Matthias J Sjerps, Holger Mitterer, James M McQueen

doi:10.1016/j.neuropsychologia.2011.09.044

Open Access

https://doi.org/10.1016/j.neuropsychologia.2011.09.044

Abstract

This study used an active multiple-deviant oddball design to investigate the time-course of normalization processes that help listeners deal with between-speaker variability. Electroencephalograms were recorded while Dutch listeners heard sequences of non-words (standards and occasional deviants). Deviants were [ipapu] or [ɛpapu], and the standard was [Iɛpapu], where [Iɛ] was a vowel that was ambiguous between [ɛ] and [i]. These sequences were presented in two conditions, which differed with respect to the vocal-tract characteristics (i.e., the average first formant frequency, F1) of the [papu] part, but not of the initial vowels [i], [ɛ] or [Iɛ] (these vowels were thus identical across conditions). Listeners more often detected a shift from [Iɛpapu] to [ɛpapu] than from [Iɛpapu] to [ipapu] in the high F1 context condition; the reverse was true in the low F1 context condition. This shows that listeners’ perception of vowels differs depending on the speaker’s vocal-tract characteristics, as revealed in the speech surrounding those vowels. Cortical electrophysiological responses reflected this normalization process as early as about 120 ms after vowel onset, which suggests that shifts in perception precede influences due to conscious biases or decision strategies. Listeners’ ability to normalize for speaker vocal-tract properties is thus to a large extent the result of a process that influences representations of speech sounds early in the speech processing stream.

Full Text