Abstract

This paper proposes an interaural time difference (ITD) estimation method for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD of each T-F bin is estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function that is designed to reduce the effect of reverberation. Next, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the pre-defined location of the target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. The proposed ITD estimation method is shown to outperform a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
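
The ITD estimation and masking steps described above can be illustrated with a minimal Python sketch. It is not the method of the paper: the abstract does not give the form of the ML weighting function, so the sketch substitutes the well-known PHAT weighting for it, and it works on a single pair of time-domain frames rather than on individual T-F bins. The function names `gcc_itd` and `itd_mask`, the 1 ms search range, and the tolerance `tol` are assumptions made for illustration only.

```python
import numpy as np


def gcc_itd(left, right, fs, max_itd=1e-3):
    """Estimate the ITD (in seconds) between two frames with GCC.

    Positive values mean the right channel lags the left channel.
    PHAT weighting is used here as a stand-in; the ML weighting
    of the paper is not reproduced (assumption).
    """
    n = len(left) + len(right)            # zero-pad to avoid circular wrap-around
    spec_l = np.fft.rfft(left, n=n)
    spec_r = np.fft.rfft(right, n=n)
    cross = spec_r * np.conj(spec_l)      # cross-power spectrum
    cross = cross / (np.abs(cross) + 1e-12)  # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation
    max_lag = max(1, int(round(max_itd * fs)))
    # Reorder so that the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs


def itd_mask(itd_est, itd_target, tol=1e-4):
    """Binary mask: 1 where the estimated ITD is within tol of the target ITD."""
    itd_est = np.asarray(itd_est, dtype=float)
    return (np.abs(itd_est - itd_target) <= tol).astype(float)
```

In a full system of the kind the abstract describes, `gcc_itd` would be evaluated for every T-F bin, and the resulting mask would be multiplied with the T-F representation of the auditory signals before resynthesis. How the tolerance is chosen, and whether the mask is binary or soft, is not stated in the abstract and is left open here.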
