Abstract
Methods for noise-robust speech recognition are often evaluated in small vocabulary speech recognition tasks. In this work, we use missing feature reconstruction for noise compensation in a large vocabulary continuous speech recognition task with speech data recorded in noisy environments such as cafeterias. In addition, we combine missing feature reconstruction with constrained maximum likelihood linear regression (CMLLR) acoustic model adaptation and propose a new method for finding noise-corrupted speech components for the missing feature approach. Using missing feature reconstruction on noisy speech is found to improve speech recognition performance significantly. The relative error reduction of 36% compared to the baseline is comparable to the error reductions achieved with acoustic model adaptation, and results improve further when reconstruction and adaptation are used in parallel.