Abstract
One of the key issues in practical speech recognition is achieving robustness against environmental mismatches caused by background noise or differing channels. Most conventional approaches compensate for the effects of such mismatches under the assumption that the environmental characteristics are stationary, which is far from what is actually observed. In this paper, we propose an approach to cope with time-varying environmental characteristics. By directly modeling the environment evolution process and the clean speech feature distribution, we construct a set of multiple linear state space models. Suboptimal state estimation under the given model structure can be performed efficiently with the interacting multiple model (IMM) algorithm. In addition to providing a comprehensive description of the compensation technique, we propose an adaptive Kalman filtering approach with which nonstationary noise evolution characteristics can be tracked. Moreover, we propose a novel method for fixed-interval smoothing within the IMM framework. The performance of the presented compensation technique in both slowly and rapidly varying noise conditions is evaluated through a number of continuous digit recognition experiments.
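For readers unfamiliar with the IMM algorithm, one filtering cycle over a set of linear state space models can be sketched as follows. This is a generic, textbook-style illustration for a scalar state, not the model structure or parameterization used in this paper: the function name `imm_step`, the tuple layout `(a, b, q, h, d, r)`, and the transition matrix `trans` are assumptions introduced only for this example. Each model j is assumed to follow x' = a*x + b + w and z = h*x' + d + v, with Gaussian noise variances q and r.

```python
import math

def imm_step(z, xs, Ps, mu, trans, models):
    """One IMM cycle for scalar states (illustrative sketch only).

    z      : current scalar observation
    xs, Ps : per-model state estimates and variances from the previous step
    mu     : per-model probabilities from the previous step
    trans  : trans[i][j] = P(model j now | model i before), assumed given
    models : list of hypothetical tuples (a, b, q, h, d, r)
    """
    M = len(models)
    # 1) Mixing: predicted model probabilities c[j].
    c = [sum(trans[i][j] * mu[i] for i in range(M)) for j in range(M)]
    x_new, P_new, like = [], [], []
    for j, (a, b, q, h, d, r) in enumerate(models):
        # Mixing weights and mixed initial condition for model j.
        w = [trans[i][j] * mu[i] / c[j] for i in range(M)]
        x0 = sum(w[i] * xs[i] for i in range(M))
        P0 = sum(w[i] * (Ps[i] + (xs[i] - x0) ** 2) for i in range(M))
        # 2) Kalman filter matched to model j: predict, then update.
        xp = a * x0 + b
        Pp = a * a * P0 + q
        e = z - (h * xp + d)          # innovation
        S = h * h * Pp + r            # innovation variance
        K = Pp * h / S                # Kalman gain
        x_new.append(xp + K * e)
        P_new.append((1.0 - K * h) * Pp)
        # Gaussian likelihood of the observation under model j.
        like.append(math.exp(-0.5 * e * e / S) / math.sqrt(2.0 * math.pi * S))
    # 3) Model probability update (no underflow protection in this sketch).
    norm = sum(like[j] * c[j] for j in range(M))
    mu_new = [like[j] * c[j] / norm for j in range(M)]
    # 4) Combination: moment-matched overall estimate and variance.
    x_hat = sum(mu_new[j] * x_new[j] for j in range(M))
    P_hat = sum(mu_new[j] * (P_new[j] + (x_new[j] - x_hat) ** 2)
                for j in range(M))
    return x_new, P_new, mu_new, x_hat, P_hat
```

The estimate is suboptimal because step 1 collapses the growing set of model histories into one Gaussian per model at every frame, which keeps the cost linear in the number of models. A caller would invoke `imm_step` once per observation, feeding the returned per-model estimates and probabilities back in as the next step's inputs.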