Age effects in the understanding of noisy speech denoised by estimated ideal binary masks.

Pierre Divenyi, Deliang Wang, Adam Lammert, Nicholas Livingston, Ke Hu

doi:10.1121/1.3384768

Abstract

An algorithm was developed to denoise speech presented in matching speech-spectrum noise by estimating the ideal binary mask (IBM), which retains the time-frequency (T-F) regions with favorable local signal-to-noise ratio (SNR) and removes the remaining regions [Hu et al., in Proceedings of the 4th SAPA Workshop, Interspeech 2008]. With this method, sentence intelligibility was tested in 20 elderly listeners and 20 normal-hearing young listeners. The elderly group had, at worst, moderate sensorineural hearing loss (SNHL) between 0.5 and 4 kHz. Speech material consisted of IEEE sentences spoken by a single female talker. Intelligibility was tested at two SNRs in the elderly group (−2 and −4 dB) and three SNRs in the young group (−4, −6, and −8 dB). Subjects listened to 50 sentences at each SNR in three conditions: denoising by the IBM estimated by the algorithm, denoising by the IBM calculated from the premixed signals, and presentation in the original noise without processing. With this last condition taken as the baseline, both keyword and phonemic intelligibility of the sentences processed with the algorithm showed a slight improvement for the young listeners but no improvement for the elderly listeners, while both groups performed substantially better than the baseline in the IBM condition. [Work supported by the VA Medical Research.]
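
To make the IBM definition concrete, the following is a minimal sketch of how a mask might be computed when the premixed speech and noise are both available (the "IBM calculated from premixed signals" condition). It is not the estimation algorithm of Hu et al. Several things here are assumptions made only for illustration: the short-time Fourier front end (a different T-F decomposition may have been used in the study), the 0 dB local criterion `lc_db`, the frame and hop sizes, and the function names `stft_power` and `ideal_binary_mask`.

```python
import numpy as np

def stft_power(x, frame_len=256, hop=128):
    """Power spectrogram of x from Hann-windowed frames (frames x frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ideal_binary_mask(speech, noise, lc_db=0.0, eps=1e-12):
    """Boolean mask: True where the local SNR of a T-F unit exceeds lc_db.

    lc_db (the local criterion) is an assumed value, not taken from the abstract.
    """
    speech_power = stft_power(speech)
    noise_power = stft_power(noise)
    # eps keeps the ratio finite when a T-F unit has no energy.
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return local_snr_db > lc_db

# Synthetic stand-ins for the premixed signals: a 1 kHz tone and white noise.
fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 1000 * t)
masker = np.random.default_rng(0).standard_normal(fs)

mask = ideal_binary_mask(target, masker)
print("mask shape:", mask.shape)
print("fraction of T-F units retained:", mask.mean())
```

Denoising would then multiply the T-F representation of the mixture by this mask and resynthesize the waveform. The estimated-IBM condition differs in that the mask has to be inferred from the mixture alone, which this sketch does not attempt.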
