Abstract

Robust speech recognition systems must address variations due to perceptually induced stress in order to maintain acceptable levels of performance in adverse conditions. This study proposes a new approach that combines stress classification and speech recognition in a single algorithm. The one-dimensional hidden Markov model (1-D HMM) is generalized to a multi-dimensional hidden Markov model (N-D HMM), in which each stressed speech style is allocated its own dimension. This formulation is shown to integrate perceptually induced stress effects better for stress-independent recognition, because the algorithm implicitly performs stress classification at the sub-phoneme (state) level. The proposed N-D HMM method is compared with neutral-trained and multi-style stress-trained 1-D HMM recognizers. Average recognition rates are shown to improve by 15.72% over the 1-D stress-dependent recognizer and by 26.67% over the 1-D neutral-trained recognizer.
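
The idea of giving each speech style its own dimension can be illustrated with a small Viterbi decoder over a grid of (style, state) cells. This is only a minimal sketch under assumptions the abstract does not specify: a left-to-right topology, a self-loop that keeps the current style, a uniform choice of style whenever the path advances to the next state, and precomputed frame log-likelihoods. The function name `nd_viterbi` and its parameters `log_b`, `log_stay` and `log_next` are hypothetical and are not taken from the paper.

```python
import numpy as np

def nd_viterbi(log_b, log_stay=np.log(0.5), log_next=np.log(0.5)):
    """Best path through a (style, state) grid; illustrative only.

    log_b[t, k, s] is the log-likelihood of frame t under style k, state s.
    Assumed topology: left-to-right states, style may change only on advance.
    """
    T, K, S = log_b.shape
    log_switch = -np.log(K)  # assumed uniform style choice when advancing
    delta = np.full((T, K, S), -np.inf)
    back = np.zeros((T, K, S, 2), dtype=int)  # predecessor (style, state)
    delta[0, :, 0] = log_switch + log_b[0, :, 0]  # start in state 0, any style

    for t in range(1, T):
        for s in range(S):
            for k in range(K):
                # Option 1: stay in the same state and the same style.
                best, prev = delta[t - 1, k, s] + log_stay, (k, s)
                # Option 2: advance from state s-1, arriving from any style.
                if s > 0:
                    kp = int(np.argmax(delta[t - 1, :, s - 1]))
                    cand = delta[t - 1, kp, s - 1] + log_next + log_switch
                    if cand > best:
                        best, prev = cand, (kp, s - 1)
                delta[t, k, s] = best + log_b[t, k, s]
                back[t, k, s] = prev

    # The path must end in the final state; pick the best final style.
    k, s = int(np.argmax(delta[T - 1, :, S - 1])), S - 1
    score = float(delta[T - 1, k, s])
    path = [(k, s)]
    for t in range(T - 1, 0, -1):
        k, s = (int(v) for v in back[t, k, s])
        path.append((k, s))
    path.reverse()
    return score, path
```

Reading the style index along the returned path gives one style label per visited state, which is one way to picture the state-level stress classification that the abstract describes as implicit in recognition.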

