Effect of reducing slow temporal modulations on speech reception

Rob Drullman, Joost M Festen, Reinier Plomp

doi:10.1121/1.409836

Abstract

The effect of reducing low-frequency modulations in the temporal envelope on the speech-reception threshold (SRT) for sentences in noise and on phoneme identification was investigated. For this purpose, speech was split into a series of frequency bands (1/4, 1/2, or 1 oct wide) and the amplitude envelope for each band was high-pass filtered at cutoff frequencies of 1, 2, 4, 8, 16, 32, 64, or 128 Hz, or infinity (completely flattened). Results for 42 normal-hearing listeners show: (1) a clear reduction in sentence intelligibility with narrow-band processing for cutoff frequencies above 64 Hz; and (2) no reduction of sentence intelligibility when only amplitude variations below 4 Hz are reduced. Based on the modulation transfer function of some conditions, it is concluded that fast multichannel dynamic compression leads to an insignificant change in masked SRT. Combining these results with previous data on low-pass envelope filtering (temporal smearing) [Drullman et al., J. Acoust. Soc. Am. 95, 1053-1064 (1994)] shows that at 8-10 Hz the temporal modulation spectrum is divided into two equally important parts. Vowel and consonant identification with nonsense syllables was studied for cutoff frequencies of 2, 8, 32, 128 Hz, and infinity, processed in 1/4-oct bands. Results for 12 subjects indicate that, just as for low-pass envelope filtering, consonants are more affected than vowels. Errors in vowel identification mainly consist of reduced recognition of diphthongs and of durational confusions. For the consonants there are no clear confusion patterns, but stops appear to suffer least. In most cases, the responses tend to fall into the correct category (stop, fricative, or vowel-like).
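
The processing described in the abstract (band splitting followed by high-pass filtering of each band's amplitude envelope) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: ideal (brick-wall) FFT filters, an analysis range of 100 to 6400 Hz, keeping the mean envelope level so the filtered envelope stays near its original level, clipping negative envelope values to zero, and all function and parameter names.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal; its magnitude is the amplitude envelope."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def band_limit(x, fs, f_lo, f_hi):
    """Ideal band-pass by FFT masking (assumption: brick-wall filter)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs >= f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def highpass_envelope(env, fs, cutoff_hz):
    """Remove envelope modulations below cutoff_hz, keeping the mean level.

    cutoff_hz = inf flattens the envelope completely.
    """
    mean_level = env.mean()
    if np.isinf(cutoff_hz):
        return np.full_like(env, mean_level)
    spectrum = np.fft.rfft(env - mean_level)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    spectrum[freqs < cutoff_hz] = 0.0
    filtered = np.fft.irfft(spectrum, n=len(env)) + mean_level
    # Assumption: clip negative values so the result is a valid envelope.
    return np.maximum(filtered, 0.0)

def reduce_slow_modulations(x, fs, cutoff_hz, bandwidth_oct=0.25,
                            f_min=100.0, f_max=6400.0):
    """Split x into bands, high-pass filter each envelope, and sum the bands."""
    n_bands = int(round(np.log2(f_max / f_min) / bandwidth_oct))
    edges = f_min * 2.0 ** (bandwidth_oct * np.arange(n_bands + 1))
    out = np.zeros(len(x))
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        band = band_limit(x, fs, f_lo, f_hi)
        env = np.abs(analytic_signal(band))
        new_env = highpass_envelope(env, fs, cutoff_hz)
        # Impose the modified envelope on the band's original fine structure.
        out += band * (new_env / np.maximum(env, 1e-12))
    return out
```

As a usage example, a 1-kHz tone amplitude-modulated at 2 Hz and passed through `reduce_slow_modulations(x, fs, cutoff_hz=8)` comes out with an essentially flat envelope, whereas `cutoff_hz=1` leaves the 2-Hz modulation in place.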
