Abstract

A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening conditions (e.g., noise or distortion) that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker’s face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition in single-talker conditions than in multiple-talker conditions for both audio-only and audio-visual speech. However, recognition in the multiple-talker context was slower in the audio-visual condition than in the audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred.

Highlights

  • In perceiving speech, we listen in order to understand what someone is saying as well as to understand who is saying it

  • In order to examine the effect of audio-visual information on the talker variability cost, a split-plot analysis of variance (ANOVA) was carried out [Talker Variability (Single-Talker vs. Multiple-Talker) × Modality of Presentation (Audio-only vs. Audio-visual), with Talker Variability as the within-subject factor and Modality of Presentation as the between-subject factor] for the dependent measures of response time (RT), hit rate, false alarm rate, and d-prime (see the analysis sketch after this list)

  • Research shows that talker variability hurts recognition accuracy (e.g., Creelman, 1957) and recognition speed (Mullennix and Pisoni, 1990; Magnuson and Nusbaum, 2007), providing what could be viewed as an adverse listening situation

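To make the analysis design concrete, the following is a minimal sketch (not taken from the paper) of how per-condition d-prime could be computed and the 2 × 2 split-plot ANOVA run in Python with the pingouin library. The data file, column names (subject, modality, talker, hit_rate, fa_rate, rt), and trial count are hypothetical; the paper does not report its analysis code.

    import numpy as np
    import pandas as pd
    import pingouin as pg
    from scipy.stats import norm

    def d_prime(hit_rate, fa_rate, n_trials):
        # d' = z(hit rate) - z(false-alarm rate), with a standard
        # correction so proportions of 0 or 1 stay finite.
        hit = np.clip(hit_rate, 0.5 / n_trials, 1 - 0.5 / n_trials)
        fa = np.clip(fa_rate, 0.5 / n_trials, 1 - 0.5 / n_trials)
        return norm.ppf(hit) - norm.ppf(fa)

    # Long format: one row per subject x talker-variability condition.
    df = pd.read_csv("word_monitoring.csv")  # hypothetical data file
    df["dprime"] = d_prime(df["hit_rate"], df["fa_rate"], n_trials=48)  # hypothetical trial count

    # Split-plot ANOVA: Talker Variability (single vs. multiple talker) is
    # the within-subject factor; Modality of Presentation (audio-only vs.
    # audio-visual) is the between-subject factor.
    for dv in ["rt", "hit_rate", "fa_rate", "dprime"]:
        aov = pg.mixed_anova(data=df, dv=dv, within="talker",
                             subject="subject", between="modality")
        print(dv)
        print(aov.round(3))

pingouin's mixed_anova is used here because it handles mixed (split-plot) designs directly, whereas purely repeated-measures tools such as statsmodels' AnovaRM cover only fully within-subject designs.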

Introduction

We listen in order to understand what someone is saying as well as to understand who is saying it. A given acoustic pattern may correspond to different phonemes, while a given phoneme may be represented by different acoustic patterns across different talkers (Peterson and Barney, 1952; Liberman et al., 1967; Dorman et al., 1977). For this reason, the speaker provides an important context for determining how acoustic patterns map onto phonetic categories (cf. Nusbaum and Magnuson, 1997). Dialect (Niedzielski, 1999) and gender (Johnson et al., 1999) expectations can meaningfully alter vowel perception, highlighting that social knowledge about a speaker can affect the relatively low-level perceptual processing of a speaker’s message, much in the same way that knowledge of vocal tract information can (Ladefoged and Broadbent, 1957; see Huang and Holt, 2012 for an auditory explanation of the mechanism that could underlie this).
