Abstract

In this study, we describe a binaural auditory model for recognition of speech in the presence of spatially separated noise intrusions, under small-room reverberation conditions. The principle underlying the model is to identify time–frequency regions that constitute reliable evidence of the speech signal. This is achieved both by determining the spatial location of the speech source and by grouping the reliable regions according to common azimuth. Reliable time–frequency regions are passed to a ‘missing data’ speech recogniser, which performs decoding based on this partial description of the speech signal. To obtain robust estimates of spatial location in reverberant conditions, we incorporate some aspects of precedence effect processing into the auditory model. We show that the binaural auditory model improves speech recognition performance in small-room reverberation conditions in the presence of spatially separated noise, particularly when the spatial separation is 20° or larger. We also demonstrate that the binaural system outperforms a single-channel approach, notably in cases where the target speech and noise intrusion have substantial spectral overlap.
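
The idea of grouping time–frequency regions by common azimuth can be illustrated with a minimal sketch. It is not the model described in this study: it assumes the two ear signals have already been split into frequency channels and short frames, it uses the interaural lag that maximises a plain cross-correlation as a stand-in for azimuth, and it omits the precedence effect processing entirely. The names `best_lag`, `azimuth_mask`, `target_lag`, `max_lag` and `tol` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def best_lag(left, right, max_lag):
    """Return the lag (in samples) that maximises sum_t left[t] * right[t + lag].

    A positive result means the right-ear frame is a delayed copy of the left.
    """
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        if lag >= 0:
            scores.append(np.dot(left[: n - lag], right[lag:]))
        else:
            scores.append(np.dot(left[-lag:], right[: n + lag]))
    return int(lags[int(np.argmax(scores))])


def azimuth_mask(left_frames, right_frames, target_lag, max_lag=16, tol=1):
    """Build a binary time-frequency mask from interaural lag estimates.

    left_frames, right_frames: arrays of shape (channels, frames, frame_len).
    A cell is marked reliable (True) when its estimated lag lies within
    `tol` samples of `target_lag`, the lag assumed for the speech source.
    """
    channels, frames, _ = left_frames.shape
    mask = np.zeros((channels, frames), dtype=bool)
    for c in range(channels):
        for f in range(frames):
            lag = best_lag(left_frames[c, f], right_frames[c, f], max_lag)
            mask[c, f] = abs(lag - target_lag) <= tol
    return mask
```

In a missing-data setting, a mask of this kind would mark which time–frequency cells the recogniser treats as reliable evidence of the speech; cells dominated by a source at another lag would be treated as missing.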
