Abstract

Prosodic phrase structure provides important information for the understanding and naturalness of synthetic speech, and a good model of prosodic phrases has applications in both speech synthesis and understanding. This work describes a statistical model of an embedded hierarchy of prosodic phrase structure, motivated by results in linguistic theory. Each level of the hierarchy is modeled as a sequence of subunits at the next level, with the lowest level of the hierarchy representing factors such as syntactic branching and prosodic constituent length using a binary tree classification. A maximum likelihood solution for parameter estimation is presented, allowing automatic training of different speaking styles. For predicting prosodic phrase breaks from text, a dynamic programming algorithm is given for finding the maximum probability prosodic parse. In experiments on a corpus of radio news, the model has been used to predict minor and major prosodic phrase boundaries from text in the hierarchy: orthographic word ⊂ minor phrase ⊂ major phrase ⊂ sentence. Good results were obtained by using only a simple function word table for part-of-speech assignments and sentence punctuation; use of syntactic information did not improve performance. Out of 23 test sentences, 7 had predicted phrase breaks that matched those used by one of five radio announcers, and an additional 10 sentences were judged to have acceptable prosodic phrasing. This performance corresponds to correctly predicting a phrase break (major or minor) at 79% of the observed phrase locations and falsely predicting a boundary at only 4% of the boundaries where no break was observed. In comparison, a tree classifier without the hierarchical formalism yields 62% correct prediction and 3% false prediction.
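The dynamic programming search mentioned above can be illustrated with a minimal single-level sketch. It is not the authors' algorithm: the function names, the break probabilities in `p_break`, and the length term in `phrase_logprob` are all hypothetical stand-ins for the model's trained parameters. It only shows how a maximum probability segmentation of a word sequence into phrases can be found by a standard recursion over phrase end points; a hierarchical version would apply the same recursion at each level.

```python
import math

def best_segmentation(n, unit_logprob):
    """Split items 0..n-1 into contiguous units with maximum total log-probability.

    unit_logprob(i, j) is an assumed scoring function for a unit covering
    items i..j-1. Returns (best_score, end index of each unit).
    """
    best = [-math.inf] * (n + 1)  # best[j]: best score for the first j items
    back = [0] * (n + 1)          # back[j]: start of the last unit in that solution
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            score = best[i] + unit_logprob(i, j)
            if score > best[j]:
                best[j] = score
                back[j] = i
    # Trace back the unit boundaries from the end of the sequence.
    ends = []
    j = n
    while j > 0:
        ends.append(j)
        j = back[j]
    ends.reverse()
    return best[n], ends

# Hypothetical probability of a phrase break after each word; the final
# value is 1.0 because the sentence end is always a boundary.
p_break = [0.05, 0.30, 0.60, 0.10, 0.20, 0.40, 1.0]

def phrase_logprob(i, j):
    """Toy score for a phrase covering words i..j-1 (illustrative only)."""
    # No break at the junctions inside the phrase, a break at its end.
    no_break = sum(math.log(1.0 - p_break[k]) for k in range(i, j - 1))
    # Assumed length preference: phrases of about three words score best.
    length_term = -0.3 * abs((j - i) - 3)
    return no_break + math.log(p_break[j - 1]) + length_term

score, ends = best_segmentation(len(p_break), phrase_logprob)
print(score, ends)
```

The recursion evaluates every (start, end) pair once, so the search is quadratic in sentence length, whereas enumerating all segmentations directly grows exponentially.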
