Abstract

Summary form only given. This paper proposes a method for removing reflected waves from a mixed wave consisting of a direct signal and its reflections. The method is a form of waveform subtraction guided by the autocorrelation functions (ACFs) of multichannel speech signals. Each reflected wave is assumed to have two parameters: path amplitude and delay time. The method estimates these parameters from the ACFs of the signals received by the microphones. The delay time of a particular path is estimated as the time lag that maximizes the difference between the ACF of the channel concerned and the average ACF of the other channels. Using the estimated delay, the delayed wave is subtracted from the received wave only for vocal segments; fricative-like and nasal-like segments are left as they are, and conventional spectral subtraction is applied to the rest of the input speech. The rate of waveform subtraction, that is, the path amplitude of a reflected wave, is estimated by minimizing the difference between the ACF of the signal concerned and the average ACF of the remaining channels at the time delay attributed to the reflection path concerned. The proposed method requires no a priori knowledge of the room characteristics or the target speech. With the proposed method, the speech recognition rate for signals picked up by three microphones in a reverberant environment improves by about 8%.
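The two estimation steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single reflection per channel, uses a plain grid search over candidate amplitudes as a stand-in for whatever minimization the paper actually uses, and omits both the segment classification (vocal, fricative-like, nasal-like) and the spectral subtraction stage. All function and parameter names (`normalized_acf`, `estimate_delay`, `estimate_amplitude`, `subtract_reflection`, `candidates`) are hypothetical.

```python
import numpy as np


def normalized_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, scaled so acf[0] == 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    energy = np.dot(x, x)
    acf = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])
    return acf / energy


def subtract_reflection(x, delay, amplitude):
    """Waveform subtraction y[t] = x[t] - amplitude * x[t - delay] (delay >= 1)."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[delay:] -= amplitude * x[:-delay]
    return y


def estimate_delay(channels, target, max_lag, min_lag=1):
    """Lag maximizing ACF(target channel) minus the mean ACF of the other channels."""
    acfs = np.array([normalized_acf(c, max_lag) for c in channels])
    others = np.delete(acfs, target, axis=0).mean(axis=0)
    diff = acfs[target] - others
    # Skip lags below min_lag so the trivial lag-0 peak is never selected.
    return min_lag + int(np.argmax(diff[min_lag:]))


def estimate_amplitude(channels, target, delay, candidates=None):
    """Grid search (an assumption of this sketch) for the subtraction rate whose
    output ACF at `delay` is closest to the other channels' mean ACF there."""
    if candidates is None:
        candidates = np.linspace(0.0, 0.95, 96)
    others_at_delay = np.mean([normalized_acf(c, delay)[delay]
                               for i, c in enumerate(channels) if i != target])
    errors = []
    for a in candidates:
        y = subtract_reflection(channels[target], delay, a)
        errors.append(abs(normalized_acf(y, delay)[delay] - others_at_delay))
    return float(candidates[int(np.argmin(errors))])
```

In this sketch the delay search relies on each channel having a different reflection delay, so the peak that a reflection adds to one channel's ACF is largely absent from the average of the others. Applying `subtract_reflection` with the estimated delay and amplitude gives the dereverberated waveform for that channel; in the described method this step would be applied only to vocal segments.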
