Abstract
Problem statement: In Thai speech synthesis using a Hidden Markov Model (HMM) based synthesis system, the tonal speech quality is degraded by tone distortion. This major problem must be treated appropriately to preserve the tone characteristics of each syllable unit, since tone contributes to the intelligibility of the synthesized speech. The tone questions and other phonetic questions in the tree-based context clustering process therefore need to be established accordingly. Approach: This study describes the analysis of questions in the tree-based context clustering process of an HMM-based speech synthesis system for the Thai language. In the system, spectrum, pitch (F0) and state duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using a decision-tree-based context clustering technique. The contextual factors that affect spectrum, pitch and duration, i.e., part of speech, position and number of phones in a syllable, position and number of syllables in a word, position and number of words in a sentence, phone type and tone type, are taken into account in constructing the questions of the decision tree. In total, thirteen sets of questions are analyzed and compared. Results: In the experiment, we analyzed the decision trees by counting the number of questions at the nodes that come from each of the thirteen sets and by calculating the dominance score given to each question as the reciprocal of the distance from the root node to the question node. The phonetic type set has the highest number of questions and the highest dominance score, while the part of speech and tone type sets rank second and third, respectively. Conclusion: By counting the number of questions in each node and calculating the dominance score, we can set the priority of each question set. The analysis results support further development of Thai speech synthesis with an efficient context clustering process in an HMM-based speech synthesis system.
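The two measures named in the Results, the per-set question count and the dominance score, can be illustrated with a minimal sketch. It is not the authors' implementation. The `Node` class, the set labels and the toy tree are hypothetical. The sketch also assumes that the root node is at distance 1 (so that the reciprocal is defined at the root) and that per-question scores are summed within each question set; the abstract specifies neither convention.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary decision-tree node; a leaf has question_set=None."""
    question_set: Optional[str] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def analyze_tree(root, root_distance=1):
    """Return (question counts, summed dominance scores) per question set.

    Assumption: the root is at distance `root_distance` (1 by default) and
    each level below it adds 1; a question's score is 1 / distance.
    """
    counts = defaultdict(int)
    scores = defaultdict(float)
    stack = [(root, root_distance)]
    while stack:
        node, dist = stack.pop()
        if node is None or node.question_set is None:
            continue  # leaves carry no question
        counts[node.question_set] += 1
        scores[node.question_set] += 1.0 / dist
        stack.append((node.yes, dist + 1))
        stack.append((node.no, dist + 1))
    return dict(counts), dict(scores)


# Toy tree with made-up set labels, for illustration only.
toy = Node(
    "phone_type",
    yes=Node("tone_type", yes=Node(), no=Node()),
    no=Node(
        "phone_type",
        yes=Node("part_of_speech", yes=Node(), no=Node()),
        no=Node(),
    ),
)
counts, scores = analyze_tree(toy)
```

In this toy tree the `phone_type` set asks two questions, at distances 1 and 2, so under the stated assumptions its summed score is 1/1 + 1/2 = 1.5. Question sets that appear closer to the root receive larger scores, which is what lets the measure rank sets by priority.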
Highlights
Introduction: The context clustering is an important process to treat the problem of limitation of …
To analyze the contribution of each set of contextual factors, we explored three decision trees generated in the clustering process at the training stage of the system: the spectrum, F0 and state duration trees.
We describe the analysis of questions in the tree-based context clustering process of an HMM-based speech synthesis system for the Thai language.
Summary
In the HMM-based speech synthesis framework, context clustering is an important process for treating the problem of limited training data. Information sharing among the training data in the same cluster, i.e., the same terminal node (tree leaf), is the essential concept of decision-tree-based context clustering, so the construction of contextual factors and the design of the tree structure must be done appropriately. A systematic analysis of the decision trees in the context clustering process has not yet been thoroughly carried out. This study focuses mainly on that analysis and could lead to an appropriate construction of question sets for the decision trees. The system consists of two stages, including the training … the sentence HMMs associated with the arbitrary given …