Abstract

Visual speech information helps listeners perceive speech in noise. The cues underlying this visual advantage appear to be global and distributed, and previous research has not succeeded in pinning down simple dimensions that explain the effect. In this study we focus on the temporal aspects of visual speech cues. Relative to a baseline of auditory-only sentences mixed with noise, we tested the effect of providing a visual speech signal that carries the rhythm of the spoken sentence, via a temporal visual mask function linked to the times of the auditory p-centers, quantified as stressed-syllable onsets. We systematically varied the alignment of the peaks of maximum visual speech exposure with these presumed anchors of sentence rhythm, and contrasted the speech cues against an abstract visual condition in which the visual signal consisted of a stylised moving curve whose dynamics were determined by the mask function. Both visual signal types provided a significant benefit to speech recognition in noise, with the speech cues providing the larger benefit. The benefit was largely independent of the delay relative to the auditory p-centers. Taken together, the results call for further inquiry into the temporal dynamics of visual and auditory speech.
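
The mask-function manipulation can be illustrated schematically. The sketch below is a minimal, hypothetical reconstruction rather than the authors' implementation: it assumes Gaussian-shaped visibility windows of a chosen width, centred at each p-center (stressed-syllable onset) plus a systematically varied delay, and clipped to [0, 1] so that the resulting function can modulate the per-frame visibility of the visual signal.

```python
import numpy as np

def visibility_mask(p_centers, delay, duration, fs=30.0, width=0.15):
    """Schematic temporal visual mask (hypothetical reconstruction).

    p_centers : times (s) of auditory p-centers (stressed-syllable onsets)
    delay     : offset (s) of the mask peaks relative to the p-centers
    duration  : sentence duration (s)
    fs        : frame rate (frames/s) of the visual signal
    width     : standard deviation (s) of each Gaussian visibility window
    """
    t = np.arange(0.0, duration, 1.0 / fs)  # frame times
    mask = np.zeros_like(t)
    for pc in p_centers:
        # Gaussian window peaking `delay` seconds after each p-center
        mask += np.exp(-0.5 * ((t - (pc + delay)) / width) ** 2)
    return np.clip(mask, 0.0, 1.0)  # 1.0 = fully visible

# Example: three p-centers in a short sentence, peaks delayed by 100 ms
mask = visibility_mask(p_centers=[0.4, 1.1, 1.8], delay=0.1, duration=2.5)
```

Applied as a per-frame opacity on the talker video, or as the amplitude of the stylised moving curve, such a mask would yield stimuli analogous to the two visual conditions contrasted in the study.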
