Abstract

Accurate prosodic phrase prediction can improve the naturalness of synthesized speech. Prosodic phrase prediction can be treated as a sequence labeling problem, and the Conditional Random Field (CRF) is typically used to solve it. Mongolian is an agglutinative language in which a very large number of word forms can be built by concatenating stems and suffixes. This characteristic makes it difficult to build a high-performance CRF-based Mongolian prosodic phrase prediction system. We introduce a new method that segments each Mongolian word into stem and suffix, each treated as an individual token. The proposed method integrates multiple features chosen according to the characteristics of Mongolian word formation. We conduct comparative experiments with the following features: word, multi-level Part-of-Speech (POS), multi-level lexical features of the suffix, and the existence of a suffix. The experimental results show that our method significantly improves the performance of the Mongolian prosodic phrase prediction system compared with the conventional method, which treats each Mongolian word directly as a token. The word feature, the level-one lexical feature of the suffix, and the suffix-existence feature are effective. The best result, measured by F1-measure, is 82.49%.
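As a rough illustration of the tokenization idea described above, the following Python sketch flattens pre-segmented words into stem and suffix tokens and attaches simple per-token features of the kind a CRF toolkit could consume. It is only a sketch under stated assumptions: the input format, the function name `to_tokens`, the feature names, the `B`/`O` label scheme, and the placeholder strings are all hypothetical and are not taken from the paper. The POS and multi-level lexical features are omitted because their definitions are not given here.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (stem, list of suffixes, word ends a prosodic phrase).
Word = Tuple[str, List[str], bool]


def to_tokens(words: List[Word]) -> Tuple[List[Dict[str, object]], List[str]]:
    """Flatten words into stem/suffix tokens with features and boundary labels."""
    features: List[Dict[str, object]] = []
    labels: List[str] = []
    for stem, suffixes, phrase_end in words:
        pieces = [stem] + suffixes
        for i, piece in enumerate(pieces):
            features.append({
                "token": piece,                     # surface form of the stem or suffix
                "is_suffix": i > 0,                 # False for the stem, True for suffixes
                "word_has_suffix": bool(suffixes),  # assumed "existence of suffix" feature
            })
            # Assumed label scheme: "B" on the last token of a word that
            # closes a prosodic phrase, "O" on every other token.
            is_last_piece = i == len(pieces) - 1
            labels.append("B" if phrase_end and is_last_piece else "O")
    return features, labels


# Placeholder strings, not real Mongolian morphemes.
example = [("stemA", ["+s1", "+s2"], False), ("stemB", [], True)]
example_features, example_labels = to_tokens(example)
```

The resulting parallel lists of feature dictionaries and labels form one training sequence. A word-level baseline would instead emit a single token per word, which is the conventional setup the abstract compares against.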
