Abstract
Recently, hidden Markov models (HMMs) have been applied successfully to both isolated and connected word recognition. However, when the same formulation is adopted for recognition of more confusable vocabularies, such as the English alphabet, the recognition performance is often less satisfactory. One main reason is that robustness issues, such as model validity, recognition parameter selection, model parameter initialization, model parameter estimation, training sample size, and the incorporation of durational information, can no longer be ignored. In this paper, a stochastic segment model (SSM), which is a simplified HMM, is proposed for speech recognition. Three specific robustness issues are then discussed, namely the choice of observation densities, the initialization of model parameters, and the incorporation of durational information. In a step-by-step attempt to address those issues, it was found that the same SSM formulation can still be adopted if acoustic and phonetic knowledge about the vocabulary is taken into account in the model parameter estimation and recognition phases. Testing on the 39-word English alpha-digit vocabulary indicates that the recognition performance based on conventional HMM techniques can be significantly improved if model parameters are adequately initialized and durational information is properly incorporated.