Abstract

Typical hands-free voice capture relies on recording a distant speaker with an array of microphones. When a speech signal is recorded in a reverberant environment, the microphones capture speech corrupted by room reverberation. As the distance between the speaker and the recording microphones increases, the power ratio of the direct signal to the late room reverberation decreases, which in turn reduces the accuracy of automatic speech recognition. This paper presents a multichannel Wiener filter that exploits geometric information about early room reflection paths to reduce late reverberation at the filter output. Such pre-processing improves the accuracy of automatic speech recognition systems. Word error rate (WER) results for simulated rooms with different reverberation times indicate that the proposed rake multichannel Wiener filter improves the accuracy of distant automatic speech recognition in reverberant environments. The WER gain achieved in the considered acoustic conditions reaches several percent.

