Abstract

Disabled individuals can realize many benefits from automatic speech recognition (ASR). To date, most ASR research has focused on normal speech. However, many individuals with physical disabilities also exhibit speech disorders. While limited research has been conducted on dysarthric speech recognition, the preliminary results indicate that additional study is necessary. Recently, increasing attention has been given to multimodal speech recognition schemes that utilize multiple input sources, most commonly audio and video. This multimodal approach has been applied to normal speech with demonstrated effectiveness. By studying the effect of audio and visual information in a human perception experiment, this study investigates whether such an approach would be useful for dysarthric speech recognition. Results of a closed-vocabulary perception test are presented. In this test, 15 normal-hearing viewers were shown videotapes of three dysarthric speakers (with cerebral palsy) speaking a series of one-syllable nonsense words that differed only in the initial consonant. The words were presented in both audio-only and audio-visual modes, and perception rates in each mode were measured. The results are analyzed and compared to other studies of visual speech perception and dysarthric speech articulation.
