A context-based approach to predict speech intelligibility in interrupted noise: Model design.

Jelmer van Schoonhoven, Koenraad S. Rhebergen, Wouter A. Dreschler

doi:10.1121/10.0009617

Abstract

The Extended Speech Transmission Index (ESTI) by van Schoonhoven et al. [(2019). J. Acoust. Soc. Am. 145, 1178-1194] was used successfully to predict intelligibility of sentences in fluctuating background noise. However, prediction accuracy was poor when the modulation frequency of the masker was low (<8 Hz). In the current paper, the ESTI was calculated per phoneme to estimate phoneme intelligibility. In the next step, the ESTI model was combined with one of two context models {Boothroyd and Nittrouer, [(1988). J. Acoust. Soc. Am. 84, 101-114]; Bronkhorst et al., [(1993). J. Acoust. Soc. Am. 93, 499-509]} in order to improve model predictions. This approach was validated using interrupted speech data, after which it was used to predict speech intelligibility of words in interrupted noise. Model predictions improved using this new method, especially for maskers with interruption rates below 5 Hz. Calculating the ESTI at phoneme level combined with a context model is therefore a viable option to improve prediction accuracy.
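As a rough illustration of how a context model can turn phoneme-level scores into word-level scores, the sketch below implements the two relations proposed by Boothroyd and Nittrouer (1988): the j-factor, p_word = p_phoneme^j, and the k-factor, p_context = 1 - (1 - p_no_context)^k. The abstract does not state how the ESTI is mapped to a phoneme recognition probability or which parameter values were used, so the function names and the numbers in the usage lines are hypothetical and this should not be read as the authors' implementation.

```python
def word_probability(p_phoneme: float, j: float) -> float:
    """j-factor relation: probability of recognizing a whole word,
    given the probability of recognizing each of its phonemes.

    j is the effective number of independent units in the word.
    For a word whose phonemes are perceived fully independently,
    j equals the number of phonemes; context lowers j.
    """
    if not 0.0 <= p_phoneme <= 1.0:
        raise ValueError("p_phoneme must lie in [0, 1]")
    if j <= 0:
        raise ValueError("j must be positive")
    return p_phoneme ** j


def with_context(p_no_context: float, k: float) -> float:
    """k-factor relation: recognition probability with context,
    given the probability without context.

    k >= 1 expresses how much context reduces the error rate;
    k = 1 means context provides no benefit.
    """
    if not 0.0 <= p_no_context <= 1.0:
        raise ValueError("p_no_context must lie in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 - (1.0 - p_no_context) ** k


# Hypothetical values, chosen only to show the calls.
p_phoneme = 0.8  # assumed phoneme score, e.g. derived from a per-phoneme index
p_word_no_context = word_probability(p_phoneme, j=3.0)  # three independent phonemes
p_word_context = word_probability(p_phoneme, j=2.5)     # assumed j < 3 for meaningful words
print(p_word_no_context, p_word_context)
```

With j below the number of phonemes, the predicted word score is higher than under full independence, which is how lexical context enters this kind of model.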

Full Text