A computational model of binaural speech recognition: Role of across-frequency vs. within-frequency processing and internal noise

Kalle J Palomäki, Guy J Brown

doi:10.1016/j.specom.2011.03.005

Open Access

Journal: Speech Communication
Publication Date: Mar 24, 2011
Citations: 3
License type: cc-by-nc-nd

Affiliation: Aalto University, University of Sheffield

Abstract

This study describes a model of binaural speech recognition that is tested against psychoacoustic findings on binaural speech intelligibility in noise. The model combines models of the auditory periphery and the binaural pathway with a recognizer that identifies speech from glimpses using the missing data approach, which allows the speech reception threshold (SRT) of the model to be compared with that of listeners. The binaural advantage arising from differences between the interaural time differences (ITDs) of the target and masker is modelled using the equalization–cancellation (EC) mechanism, applied either independently within each frequency channel or across all channels. The model is tested using a stimulus paradigm in which the target speech and noise interference are split into low- and high-frequency bands, so that the ITD in each band can be varied independently. The match between the model and listener data is quantified by a normalized SRT distance and a correlation metric, which show a slightly better match for the within-channel model (SRT: 0.5 dB, correlation: 0.94) than for the across-channel model (SRT: 0.7 dB, correlation: 0.90). However, as the differences between the approaches are small and non-significant, our results suggest that listeners exploit ITD via a mechanism that is neither fully frequency-dependent nor fully frequency-independent.
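The contrast between within-channel and across-channel EC processing can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the signals are already split into frequency channels (rows of an array), uses integer-sample circular shifts as a stand-in for the equalization delay, and omits the internal noise and the missing data recognizer. The function names `residual_energy`, `ec_within_channel` and `ec_across_channel` are hypothetical.

```python
import numpy as np

def residual_energy(left, right, delays):
    """Energy remaining after equalization and cancellation.

    left, right: arrays of shape (channels, samples), one row per
    frequency channel (assumed to be band-filtered already).
    Returns an array of shape (channels, len(delays)).
    """
    out = np.empty((left.shape[0], len(delays)))
    for j, d in enumerate(delays):
        # Equalize: undo a candidate interaural delay of d samples.
        # A circular shift is a simplifying assumption of this sketch.
        aligned = np.roll(right, -d, axis=1)
        # Cancel: subtract one ear from the other and measure the residue.
        out[:, j] = np.sum((left - aligned) ** 2, axis=1)
    return out

def ec_within_channel(left, right, delays):
    """Pick the best cancelling delay independently in each channel."""
    energy = residual_energy(left, right, delays)
    return np.asarray(delays)[np.argmin(energy, axis=1)]

def ec_across_channel(left, right, delays):
    """Pick one delay that minimizes the residue summed over all channels."""
    energy = residual_energy(left, right, delays)
    return int(np.asarray(delays)[np.argmin(energy.sum(axis=0))])

# Toy masker: two channels of noise with a different ITD in each band,
# loosely mirroring the split-band stimulus paradigm.
rng = np.random.default_rng(0)
masker = rng.standard_normal((2, 4000))
itd = [3, -5]  # ITD in samples for the low and high band (illustrative values)
left = masker
right = np.stack([np.roll(masker[c], itd[c]) for c in range(2)])

delays = list(range(-8, 9))
within = ec_within_channel(left, right, delays)
across = ec_across_channel(left, right, delays)
print("within-channel delays:", within)
print("across-channel delay:", across)
```

In this toy case the within-channel variant recovers the masker ITD of each band separately, whereas the across-channel variant must settle on a single delay and so can cancel the masker fully in only one band.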

Full Text