Recognition of lexical stress in a continuous speech understanding system - A pattern recognition approach

A Waibel

doi:10.1109/icassp.1986.1168788

Abstract

Stress is one of the key components in human speech perception. Its role extends from the phonetic level through the lexical to the syntactic and semantic levels. Several methods have been developed in the past to detect stress automatically from the signal. This paper takes a pattern recognition approach to the problem of stress detection. The algorithm presented has three key features: (1) optimal combination of the evidence obtained from the acoustic correlates of stress is achieved by means of a Bayesian classifier assuming multivariate Gaussian distributions; (2) the algorithm detects lexical stress in continuously spoken English utterances; (3) rather than making hard decisions, the algorithm returns a probability for each syllable, i.e., a measure of stressedness. The algorithm was tested on four databases of differing continuous speech data. When a forced decision is imposed by setting a threshold at stress probability 0.5, error rates of 7.79% to 14.85% (missed stresses) were obtained. Unlike in other languages (such as Japanese), amplitude integrals are the strongest predictor of English stress. Performance results and an analysis of errors are presented.
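As an illustration of the kind of classifier the abstract describes, the following is a minimal sketch of a two-class Bayesian classifier with one multivariate Gaussian per class, returning a stress probability per syllable and applying the 0.5 threshold only when a forced decision is needed. It is not the paper's implementation: the two features (an amplitude integral and a duration), the toy training values, the equal class priors, and all function names are assumptions made for this example.

```python
import math

def fit_gaussian(samples):
    """Maximum-likelihood mean vector and covariance matrix of the rows."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in samples) / n
            for j in range(d)] for i in range(d)]
    return mean, cov

def cholesky(a):
    """Lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = len(x)
    L = cholesky(cov)
    diff = [x[i] - mean[i] for i in range(d)]
    # Forward substitution solves L z = diff; z . z is the squared
    # Mahalanobis distance of x from the class mean.
    z = []
    for i in range(d):
        z.append((diff[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    maha = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + maha)

def stress_probability(x, stressed, unstressed, prior_stressed=0.5):
    """Posterior P(stressed | x) by Bayes' rule (equal priors are an assumption)."""
    ls = math.log(prior_stressed) + log_gaussian(x, *stressed)
    lu = math.log(1.0 - prior_stressed) + log_gaussian(x, *unstressed)
    m = max(ls, lu)  # subtract the maximum to avoid overflow in exp
    es, eu = math.exp(ls - m), math.exp(lu - m)
    return es / (es + eu)

# Hypothetical training syllables: [amplitude integral, duration in seconds].
stressed_train = [[0.9, 0.30], [1.1, 0.35], [1.0, 0.28], [1.2, 0.33]]
unstressed_train = [[0.3, 0.12], [0.4, 0.15], [0.2, 0.10], [0.35, 0.18]]

stressed_model = fit_gaussian(stressed_train)
unstressed_model = fit_gaussian(unstressed_train)

p_hi = stress_probability([1.0, 0.30], stressed_model, unstressed_model)
p_lo = stress_probability([0.3, 0.13], stressed_model, unstressed_model)

# A forced decision, as in the abstract, thresholds the probability at 0.5.
decisions = [p >= 0.5 for p in (p_hi, p_lo)]
```

Keeping the posterior rather than the thresholded decision lets a downstream recognizer weigh stress evidence against other knowledge sources; the threshold is only applied when a hard label is required, for example when counting missed stresses.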

Full Text