Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition

K Hirose, K Iwano

doi:10.1109/icassp.2000.862094

Abstract

We have been developing a reliable method of prosodic word boundary detection for Japanese continuous speech based on the statistical modeling of mora transitions of fundamental frequency contours of prosodic words. Modifications to the codebook sizes and the HMM topologies improved the boundary detection performance. When using mora boundary information obtainable from the phoneme recognition process, the detection rate reached about 73% with 12.5% insertion errors in speaker-open experiments. This method was then integrated into a continuous speech recognition system with unlimited vocabulary. The integrated system conducts the recognition process in two stages: the first stage detects mora boundaries without prosodic information, and the second stage increases the mora recognition rate using prosodic word boundary information. Slight improvements in mora recognition rates were observed in both speaker-closed and speaker-open experiments.

Full Text