Throughout his career, Ira Hirsh studied and published articles and books pertaining to many aspects of the auditory system. These included sound conduction in the ear, cochlear mechanics, masking, auditory localization, psychoacoustic behavior in animals, speech perception, medical and audiological applications, coupling between psychophysics and physiology, and ecological acoustics. However, it is Hirsh’s work on the auditory timing of simple and complex rhythmic patterns, the backbone of speech and music, that is at the heart of his more recent work. In this paper, we report on several aspects of temporal processing of speech signals, both within and across sensory systems. Data are presented on perceived simultaneity and intelligibility of auditory and auditory-visual speech stimuli where stimulus components are presented either synchronously or asynchronously. Differences in the symmetry and shape of temporal windows derived from these data sets are highlighted. Results show two distinct ranges of temporal integration for speech processing: one relatively short window, about 40 ms, and the other much longer, around 250 ms. In the case of auditory-visual speech processing, the temporal window is highly asymmetric, strongly favoring conditions where the visual stimulus precedes the acoustic stimulus.

LEARNING OBJECTIVES
1) To show the connection between Hirsh’s work on the perception of temporal order and recent work on the processing of asynchronous speech, both cross-spectrally and cross-modally.
2) To compare and contrast the effects of temporal misalignment in auditory-alone and auditory-visual speech processing.

Submitted to Seminars in Hearing. Effects of Spectro-Temporal Asynchrony on Speech Processing (Grant et al.)

KEY WORDS
Ira J. Hirsh, Temporal Order, Spectro-Temporal Integration, Temporal Speech Processing

ABBREVIATIONS
ms: milliseconds
CV: Consonant-Vowel stimuli
AxVy: Incongruent auditory-visual speech tokens where the subject hears the speech token x and simultaneously sees the speech token y
TWI: Temporal Window of Integration
ADS: an Asymmetric Double Sigmoidal curve fit through the data

CEU QUESTIONS
1) What was the area of Hirsh’s research that occupied his primary interest in the latter part of his career?
A) Measurement of hearing
B) Sound reproduction
C) Auditory physiology
D) Temporal processing of simple and complex sounds
E) Animal psychophysics

2) The temporal window of integration for auditory-visual speech stimuli when the visual stimulus leads the acoustic stimulus is roughly
A) 500 ms
B) 250 ms
C) 20 ms
D) 40 ms
E) 100 ms

3) For auditory-visual McGurk stimuli, as the acoustic and visual signals become more and more “out of sync,” the perceived stimulus is dominated by the
A) Visual stimulus
B) Acoustic stimulus
C) A noise-like stimulus that is part acoustic, part visual
D) There is no dominant response once the auditory and visual signals are “out of sync”
E) The subject’s response depends on their auditory and visual acuity

4) Spectro-temporal integration windows for auditory-only speech recognition are
A) Symmetrical and long (about 250 ms)
B) Symmetrical and short (about 40 ms)
C) Asymmetrical and long (about 250 ms)
D) Asymmetrical and short (about 40 ms)
E) Symmetrical, but the length varies between 100-200 ms depending on the subject
5) The magnitude and shape of the temporal window of integration for auditory-visual speech
A) Depends on whether the audio or visual signal is leading
B) Is asymmetrical with a broad plateau region for visual leading conditions
C) Is about 40 ms when the audio signal leads the visual signal and about 250 ms when the visual signal leads the audio signal
D) Is similar for both speech identification and synchrony detection