Abstract

A microphone array can be used for hands-free acquisition of speech under reverberant conditions. This requires knowledge of the desired talker's location, which can be obtained by estimating the time delays between the signals received by one or more pairs of spatially separated microphones. However, a typical audio-conference room exhibits strong reverberation, which can severely degrade the performance of conventional time delay estimation (TDE) methods. In this article, we present and evaluate a new cepstral prefiltering technique that can be applied to the received signals before the actual TDE in order to obtain a more accurate delay estimate in a typical reverberant environment. The technique is based on estimating the minimum-phase component (MPC) of the channel cepstrum and subtracting it from the total cepstrum of each microphone signal. Thus, just as certain TDE methods must estimate the power spectral densities of the signals of interest from the received data, the new method must estimate the channel MPC in the cepstral domain. The performance of a TDE system with and without cepstral prefiltering is compared via Monte-Carlo simulations for fixed random and speech sources as well as for a moving random source. The results clearly demonstrate the beneficial effect of the new cepstral prefiltering technique on TDE performance when the source is fixed or slowly moving.
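To make the two building blocks mentioned above concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes NumPy, a plain cross-correlation peak picker standing in for a "conventional" TDE method, and the standard homomorphic folding rule for obtaining the minimum-phase cepstrum of a single sequence. The function names `estimate_delay` and `minimum_phase_cepstrum` are hypothetical, and the sketch does not show how the channel MPC is estimated from microphone data, which the abstract does not specify.

```python
import numpy as np


def estimate_delay(x1, x2, max_lag):
    """Estimate the delay of x2 relative to x1, in samples.

    Assumption: plain cross-correlation peak picking, used here only as
    a stand-in for a conventional TDE method. Requires max_lag >= 1.
    """
    n = len(x1) + len(x2)  # zero-pad so circular correlation does not wrap
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cc = np.fft.irfft(X2 * np.conj(X1), n)
    # Reorder so that the array covers lags -max_lag .. +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


def minimum_phase_cepstrum(x, n_fft):
    """Cepstrum of the minimum-phase sequence sharing x's magnitude spectrum.

    Assumption: standard folding of the real cepstrum (keep index 0,
    double the positive quefrencies, zero the negative ones).
    """
    log_mag = np.log(np.abs(np.fft.fft(x, n_fft)) + 1e-12)
    real_cep = np.fft.ifft(log_mag).real
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:(n_fft + 1) // 2] = 2.0
    if n_fft % 2 == 0:
        fold[n_fft // 2] = 1.0  # Nyquist quefrency is not doubled
    return real_cep * fold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal(4096)
    true_delay = 7  # hypothetical delay in samples
    delayed = np.concatenate((np.zeros(true_delay), source[:-true_delay]))
    print("estimated delay:", estimate_delay(source, delayed, max_lag=32))

    # A two-tap minimum-phase "channel" used purely as a sanity check.
    channel = np.array([1.0, 0.5])
    print("first cepstral coefficients:",
          minimum_phase_cepstrum(channel, 1024)[:3])
```

In a prefiltering scheme of the kind described, an estimate of the channel's minimum-phase cepstrum would be subtracted from each microphone signal's cepstrum before a delay estimator is applied. The sketch stops short of that step because it depends on estimation details not given here.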
