Abstract
In this paper, a new acoustic model called time-inhomogeneous hidden Bernoulli model (TI-HBM) is introduced as an alternative to hidden Markov model (HMM) in continuous speech recognition. Contrary to HMM, the state transition process in TI-HBM is not a Markov process, rather it is an independent (generalized Bernoulli) process. This difference leads to elimination of dynamic programming at state-level in TI-HBM decoding process. Thus, the computational complexity of TI-HBM for probability evaluation and state estimation is O ( NL ) (instead of O ( N 2 L ) in the HMM case, where N and L are number of states and sequence length respectively). As a new framework for phone duration modeling, TI-HBM is able to model acoustic-unit duration (e.g. phone duration) by using a built-in parameter named survival probability. Similar to the HMM case, three essential problems in TI-HBM have been solved. An EM-algorithm-based method has been proposed for training TI-HBM parameters. Experiments in phone recognition for Persian (Farsi) spoken language show that the TI-HBM has some advantages over HMM (e.g. more simplicity and increased speed in recognition phase), and also outperforms HMM in terms of phone recognition accuracy.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.