Abstract

In this study we focus on robust speech recognition in car environments. We used weighted finite-state transducers (WFSTs) because they provide an elegant, uniform, and flexible way of integrating various knowledge sources into a single search network. To improve the robustness of the WFST speech recognition system, we applied nonlinear spectral subtraction (SS) to suppress the noise in noisy speech. Using the “clean” speech signal obtained from SS, we performed supervised adaptation of the WFST network to the characteristics of a given driver. In the best case, under highly noisy conditions, the speaker-dependent WFST decoder achieved an improvement of 70 percentage points over traditional speaker-independent speech recognition systems.
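
The abstract does not say which nonlinear SS rule was used, so the following is only a minimal sketch of one common variant: power-domain subtraction with an SNR-dependent oversubtraction factor and a spectral floor. The function name and the parameters `alpha_min`, `alpha_max`, and `beta` are illustrative assumptions, not values taken from the study. The sketch operates on the magnitude spectrum of a single frame and assumes a noise magnitude estimate is already available (for example, averaged over non-speech frames).

```python
import math

def nonlinear_spectral_subtraction(noisy_mag, noise_mag,
                                   alpha_min=1.0, alpha_max=4.0, beta=0.02):
    """Clean one frame of magnitude spectrum, bin by bin.

    All parameter names and defaults are hypothetical, for illustration only.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Per-bin SNR estimate in dB (guarded against zero magnitudes).
        snr_db = 20.0 * math.log10(max(y, 1e-12) / max(n, 1e-12))
        # Nonlinear part: low-SNR bins get a larger oversubtraction factor.
        # SNR in [0, 20] dB maps linearly onto alpha in [alpha_max, alpha_min].
        t = min(max(snr_db / 20.0, 0.0), 1.0)
        alpha = alpha_max - t * (alpha_max - alpha_min)
        # Subtract the scaled noise power, then apply a spectral floor so the
        # result never goes negative (this limits "musical noise").
        power = y * y - alpha * n * n
        floor = beta * n * n
        cleaned.append(math.sqrt(max(power, floor)))
    return cleaned
```

In a full front end, the cleaned magnitudes would be recombined with the noisy phase and passed to feature extraction before decoding; that step is omitted here.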
