A key consideration in the development of real-time speech processing algorithms is the latency between when speech is first captured and when the processed speech is transmitted. Much of the previous work informing the acceptability of processing latency has focused on the thresholds at which delays are detectable or annoying. This study aimed to explore the effects of delays on responding to a task necessitating speech understanding. Sentence-length audiovisual speech stimuli were modified such that the audio stream was delayed relative to the visual stream as well as to an attenuated copy of the audio stream by 10 and 400 ms, respectively. These stimuli were presented in conjunction with a pictorial referent-choice task to 50 normal-hearing participants. Participants responded more slowly to the task when the audio stream was delayed by 400 ms relative to the video stream. With an additional attenuated copy of the audio stream synchronized with the video stream, reaction time decreased, but this effect was limited to cases in which the target word appeared near the onset of the sentence, highlighting the complexity of delay perception in the context of continuous speech. Implications for determining permissible processing delays for digital electronic hearing protection devices are discussed.