Abstract

The search over context-dependent continuous density Hidden Markov Models (HMMs), including state-likelihood computations, accounts for a considerable part of the total decoding time of a speech recognizer. This is especially apparent in tasks that incorporate large vocabularies and long-dependency n-gram grammars, since these impose a high degree of context dependency and HMMs have to be treated differently in each context. This paper proposes a strategy for the acoustic match of typical continuous density HMMs, decoupled from the main search and conducted as a separate component suited for parallelization. Instead of computing a large number of probabilities for different alignments of each HMM, the proposed method computes all alignments, but more efficiently. Each HMM is matched only once against any time interval, and the result can thus be looked up instantly by the main search algorithm as required. To accomplish this in real time, a fast time-warping match algorithm is proposed that exploits the specifics of the 3-state left-to-right HMM topology without skips. In proof-of-concept tests using a highly optimized SIMD-parallel implementation, the algorithm performed time-synchronous decoupled evaluation of a triphone acoustic model, with a maximum phone duration of 40 frames, at a real-time factor of 0.83 on one of the CPUs of a Dual-Xeon 2 GHz workstation. The algorithm computed the likelihood of 636,000 locally optimal HMM paths per second, with full state evaluation.
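
To make the idea of matching one HMM against every time interval concrete, the following is a minimal sketch of a plain dynamic-programming baseline, not the fast time-warping algorithm the paper proposes. It assumes a 3-state left-to-right topology without skips, precomputed per-frame state log-likelihoods, and transition probabilities that are ignored (treated as zero log-cost). The names `match_all_intervals`, `loglik`, and `max_dur` are hypothetical and chosen only for illustration.

```python
NEG_INF = float("-inf")


def match_all_intervals(loglik, max_dur):
    """Score one 3-state left-to-right HMM (no skips) on every interval.

    loglik[t][j] is the log-likelihood of frame t under state j (j = 0, 1, 2).
    Assumption: transition log-probabilities are taken as 0 for simplicity.
    Returns a dict mapping (start, end), end inclusive, to the best
    (Viterbi) log score of a path that starts in state 0 at `start` and
    ends in state 2 at `end`, for durations of at most `max_dur` frames.
    """
    num_frames = len(loglik)
    scores = {}
    for start in range(num_frames):
        # v[j]: best log score of a path begun at `start` that is in
        # state j at the current frame.
        v = [loglik[start][0], NEG_INF, NEG_INF]
        last_end = min(num_frames, start + max_dur)
        for end in range(start + 1, last_end):
            obs = loglik[end]
            v = [
                v[0] + obs[0],               # stay in state 0
                max(v[0], v[1]) + obs[1],    # enter or stay in state 1
                max(v[1], v[2]) + obs[2],    # enter or stay in state 2
            ]
            # The final state is reachable only after at least 3 frames.
            if v[2] > NEG_INF:
                scores[(start, end)] = v[2]
    return scores
```

Once such a table has been filled, a main search could retrieve the score of any (start, end) hypothesis by a dictionary lookup instead of re-evaluating the HMM in each context. This baseline costs on the order of T × max_dur state updates per HMM for T frames; the abstract's point is that the proposed algorithm obtains the same kind of table more efficiently.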
