Abstract

This paper investigates representing tone unit boundaries (pauses), in addition to words, in a corpus of spoken English. An analysis of data from MARSEC (Machine Readable Spoken English Corpus) shows that, for professional speakers, including this minimal prosodic information lowers the perplexity of a language model. The analysis uses information-theoretic techniques, and entropy indicators, which are explained, provide an objective method of evaluation. The result is of general interest and supports the development of improved language models for many applications. Automated capture of pauses appears technically feasible and warrants further investigation. The investigation was prompted by a specific task in broadcasting technology: the semi-automated production of online subtitles for live television programmes. This task is described, and an approach to it using speech recognition technology is explained.
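
As a rough illustration of the entropy-based evaluation mentioned above, the sketch below computes per-token cross-entropy (in bits) and the corresponding perplexity for a token stream in which pauses appear as an ordinary token. It is a minimal sketch under stated assumptions, not the paper's method: the add-one smoothed unigram model, the `<pause>` token, the function names, and the toy sentences are all hypothetical and are not drawn from MARSEC.

```python
import math
from collections import Counter

def unigram_model(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def cross_entropy(model, tokens):
    """Average negative log2 probability per token, in bits."""
    return -sum(math.log2(model[t]) for t in tokens) / len(tokens)

def perplexity(model, tokens):
    """Perplexity is 2 raised to the per-token cross-entropy."""
    return 2 ** cross_entropy(model, tokens)

# Hypothetical toy data: "<pause>" stands in for a tone unit boundary.
train = "the news <pause> at ten <pause> the weather <pause>".split()
test = "the weather <pause> at ten <pause>".split()
vocab = set(train) | set(test)

model = unigram_model(train, vocab)
h = cross_entropy(model, test)
pp = perplexity(model, test)
print(f"cross-entropy: {h:.3f} bits/token, perplexity: {pp:.3f}")
```

A lower perplexity means the model is, on average, less uncertain about the next token. A uniform model over a vocabulary of size V has perplexity exactly V, which gives a baseline against which any reduction can be read.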
