Abstract

A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades both the speech intelligibility and Automatic Speech Recognition (ASR) performance. Previously, we proposed a single-microphone dereverberation method, named Harmonicity based dEReverBeration (HERB). HERB estimates the inverse filter for an unknown room transfer function by utilizing an essential feature of speech, namely harmonic structure. In previous studies, improvements in speech intelligibility was shown solely with spectrograms, and improvements in ASR performance were simply confirmed by matched condition acoustic model. In this paper, we undertook a further investigation of HERB's potential as regards to the above two factors. First, we examined speech intelligibility by means of objective indices. As a result, we found that HERB is capable of improving the speech intelligibility to approximately that of clean speech. Second, since HERB alone could not improve the ASR performance sufficiently, we further analyzed the HERB mechanism with a view to achieving further improvements. Taking the analysis results into account, we proposed an appropriate ASR configuration and conducted experiments. Experimental results confirmed that, if HERB is used with an ASR adaptation scheme such as MLLR and a multicondition acoustic model, it is very effective for improving ASR performance even in unknown severely reverberant environments.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call