Abstract
Mismatch is known to degrade the performance of speech recognition systems. In real-life applications, we often encounter nonstationary mismatch sources. A general way to compensate for slowly time-varying mismatch is to use sequential algorithms with forgetting. The forgetting factor is usually chosen empirically on some development data, and no optimality criterion is used. In this paper we introduce a framework for obtaining an optimal forgetting factor. In sequential algorithms, a recursion is usually used to calculate the required parameters so as to optimize a certain performance measure. To obtain optimal forgetting, we develop a recursion to calculate the forgetting factor that optimizes the same performance criterion as the original recursion. When combined, the two recursions result in a sequential algorithm that simultaneously optimizes the desired parameters and the forgetting factor. The proposed method is applied in conjunction with a sequential noise estimation algorithm, but the same principle can be extended to a wide range of sequential algorithms. The algorithm is extensively evaluated on different speech recognition tasks: the 5K Wall Street Journal task corrupted by different types of artificially added noise, a command and digit database recorded in a noisy car environment, and a 20K Japanese broadcast news task corrupted by field noise. In all cases, the sequential algorithm was found to perform as well as or better than batch estimation. In addition, the proposed optimal forgetting algorithm performs as well as the best hand-tuned forgetting factor. This results in a continuously adaptive compensation technique that needs no manual adjustment.
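The idea of coupling two recursions can be illustrated with a minimal sketch. It is not the noise estimation algorithm of the paper: as an assumption for illustration, it tracks a scalar mean with exponential forgetting and adapts the forgetting factor by a gradient step on the squared one-step prediction error. The function name `adaptive_forgetting_mean` and the parameters `lam0`, `step`, `lam_min`, and `lam_max` are hypothetical.

```python
def adaptive_forgetting_mean(samples, lam0=0.95, step=1e-3,
                             lam_min=0.5, lam_max=0.999):
    """Track a slowly varying mean and adapt the forgetting factor.

    Illustrative assumption, not the paper's recursion:
      estimate recursion:  est <- lam * est + (1 - lam) * x
      forgetting recursion: gradient descent on 0.5 * (x - est)**2
    Returns a list of (estimate, forgetting_factor) pairs, one per
    sample after the first.
    """
    est = samples[0]   # initialise the estimate with the first sample
    grad = 0.0         # sensitivity d(est)/d(lam), updated recursively
    lam = lam0
    history = []
    for x in samples[1:]:
        err = x - est  # prediction error of the previous estimate
        # d(0.5 * err**2)/d(lam) = -err * grad, so descent adds err * grad.
        lam = min(lam_max, max(lam_min, lam + step * err * grad))
        # Differentiate the estimate recursion with respect to lam,
        # treating lam as slowly varying: d(est_new) = lam * grad - err.
        grad = lam * grad - err
        est = lam * est + (1.0 - lam) * x
        history.append((est, lam))
    return history


if __name__ == "__main__":
    # A level shift: the forgetting factor shrinks so the estimate adapts faster.
    signal = [0.0] * 5 + [1.0] * 50
    final_est, final_lam = adaptive_forgetting_mean(signal)[-1]
    print(final_est, final_lam)
```

In this sketch, a persistent prediction error of one sign pushes the forgetting factor down, which shortens the effective memory; on a stationary input the error vanishes and the factor stays where it is. The clamp to `[lam_min, lam_max]` is a design choice made here to keep the recursion stable.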