Assigning phrase breaks from part-of-speech sequences

Paul Taylor, Alan W Black

doi:10.1006/csla.1998.0041

Journal: Computer Speech & Language
Publication Date: Apr 1, 1998
Citations: 213

Affiliation: University of Edinburgh

#Phrase Breaks #Markov Model

Abstract

This paper presents an algorithm for automatically assigning phrase breaks to unrestricted text for use in a text-to-speech synthesizer. Text is first converted into a sequence of part-of-speech tags. Next a Markov model is used to give the most likely sequence of phrase breaks for the input part-of-speech tags. In the Markov model, states represent types of phrase break and the transitions between states represent the likelihoods of sequences of phrase types occurring. The paper reports a variety of experiments investigating part-of-speech tag-sets, Markov model structure and smoothing. The best setup correctly identifies 79% of breaks in the test corpus.
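The decoding step described in the abstract can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the paper's implementation: it uses only two juncture types (break `B` and non-break `NB`), a bigram transition model, and a single part-of-speech tag per juncture. The tag names, the probability tables, the `FLOOR` constant, and the function name `viterbi_breaks` are all hypothetical and invented for illustration; none of the numbers come from the paper.

```python
import math

# All values below are hypothetical toy parameters, not figures from the paper.
STATES = ("NB", "B")  # NB = no break after the word, B = break after the word

# P(next juncture type | previous juncture type)
TRANSITION = {
    "NB": {"NB": 0.80, "B": 0.20},
    "B": {"NB": 0.95, "B": 0.05},
}

# P(POS tag of the word before the juncture | juncture type)
EMISSION = {
    "NB": {"DET": 0.35, "ADJ": 0.20, "NOUN": 0.20, "VERB": 0.20, "PUNC": 0.05},
    "B": {"DET": 0.02, "ADJ": 0.03, "NOUN": 0.35, "VERB": 0.10, "PUNC": 0.50},
}

INITIAL = {"NB": 0.9, "B": 0.1}
FLOOR = 1e-6  # crude stand-in for smoothing of unseen tags (an assumption)


def viterbi_breaks(tags):
    """Return the most likely juncture label (NB or B) after each tagged word."""
    if not tags:
        return []

    def emit(state, tag):
        return math.log(EMISSION[state].get(tag, FLOOR))

    # Log-probability of the best path ending in each state at each position.
    scores = [{s: math.log(INITIAL[s]) + emit(s, tags[0]) for s in STATES}]
    backpointers = []

    for tag in tags[1:]:
        prev = scores[-1]
        row, pointers = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda p: prev[p] + math.log(TRANSITION[p][s])
            )
            row[s] = prev[best_prev] + math.log(TRANSITION[best_prev][s]) + emit(s, tag)
            pointers[s] = best_prev
        scores.append(row)
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full sequence.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi_breaks(["DET", "NOUN", "PUNC"]))
```

With these invented tables, the tag sequence `DET NOUN PUNC` decodes to `NB NB B`, that is, a single break after the punctuation. The abstract reports experiments with different tag-sets, model structures, and smoothing, so a realistic model would presumably estimate its tables from a labelled corpus rather than hard-code them as done here.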
