Probabilistic estimation of lexical stress

Ghassan J Freij

doi:10.1121/1.2024777

Abstract

A probabilistic technique is described for automatically estimating the lexical stress pattern of isolated words using hidden Markov models (HMMs) with continuous asymmetric probability density functions. Stress information is potentially useful for large-vocabulary speech recognition because it facilitates the extraction of more robust phonetic data within stressed syllables and leads to more efficient lexical access. Adopting a binary stressed-unstressed strategy, a constrained HMM network was trained using observation vectors consisting of ten acoustic measurements of intensity, pitch, and spectral balance. These measurements were extracted from a corpus of disyllabic stress-minimal word pairs, each word embedded in a phrase and repeated eight times. Further, the underlying temporal structure of the data is represented implicitly by the probability of remaining in a state relative to the probability of exiting from it. Using this technique, classification accuracy was measured to be 89% for stressed syllables and 81% for unstressed syllables.
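The implicit duration modelling mentioned above can be illustrated with a minimal sketch. It assumes a standard first-order HMM in which a state has a fixed self-transition probability, so the number of frames spent in that state follows a geometric distribution. The function names and the value 0.9 are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
def duration_pmf(self_loop: float, frames: int) -> float:
    """Probability of occupying a state for exactly `frames` frames.

    `self_loop` is the probability of remaining in the state at each step;
    the exit probability is 1 - self_loop (illustrative assumption).
    """
    if not 0.0 <= self_loop < 1.0:
        raise ValueError("self_loop must lie in [0, 1)")
    if frames < 1:
        raise ValueError("frames must be at least 1")
    # Stay (frames - 1) times, then exit once.
    return self_loop ** (frames - 1) * (1.0 - self_loop)


def expected_duration(self_loop: float) -> float:
    """Mean number of frames spent in the state (mean of the geometric law)."""
    if not 0.0 <= self_loop < 1.0:
        raise ValueError("self_loop must lie in [0, 1)")
    return 1.0 / (1.0 - self_loop)


if __name__ == "__main__":
    a = 0.9  # hypothetical self-transition probability
    print(expected_duration(a))
    print([round(duration_pmf(a, d), 4) for d in range(1, 6)])
```

Under this assumption, a larger ratio of staying probability to exit probability corresponds to a longer expected state occupancy, which is how a syllable's duration can influence the model without being measured explicitly.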
