Abstract
The synthesis unit of a speech synthesizer directly affects both computational load and output speech quality. The phoneme is generally the best choice for synthesizing high-quality speech, but segmenting words into phonemes precisely requires knowledge of the language, and an accurate phoneme dictionary is expensive to compile. This study introduces another type of synthesis unit: the letter. In Malay, a letter is a smaller unit than a phoneme. Using the letter as the synthesis unit saves considerable effort because the context labels can be created fully automatically, without knowledge of the language. Four systems were built to investigate how the synthesis unit affects the quality of synthetic speech. Forty-eight listeners were hired to rate the output speech individually, and the results showed no obvious difference between speech synthesized with the different synthesis units. The listening test showed satisfactory results in terms of similarity, naturalness and intelligibility. Synthetic speech with polyphonic labels was more intelligible than synthetic speech without them. Using the letter as the synthesis unit is recommended because it removes the dependency on linguists and extends the idea of language-independent front-end text processing.
Highlights
Speech synthesis is the process of transforming a textual representation of speech into a waveform (Lim et al., 2012)
where S is the number of substitutions, D is the number of deletions, I is the number of insertions and C is the number of correct words. Word Error Rate (WER) (%): 2.3, 8.2, 14.4, 9.8
We have presented the first statistical parametric speech synthesis system for the Malay language and evaluated the effect of different synthesis units on the quality of the output speech
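The highlights define S, D, I and C but the equation they belong to is not reproduced on this page. The sketch below uses the conventional definition of word error rate, WER = 100 × (S + D + I) / (S + D + C), which is an assumption here and may differ from the exact formula used in the paper. The function names `align_counts` and `wer` and the example sentences are hypothetical and chosen only for illustration.

```python
def align_counts(reference, hypothesis):
    """Return (S, D, I, C) from a minimum-edit-distance word alignment."""
    ref, hyp = reference.split(), hypothesis.split()
    # table[i][j] holds (cost, S, D, I, C) for ref[:i] aligned with hyp[:j].
    table = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        table[i][0] = (i, 0, i, 0, 0)      # only deletions
    for j in range(1, len(hyp) + 1):
        table[0][j] = (j, 0, 0, j, 0)      # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost, s, d, ins, c = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (cost, s, d, ins, c + 1)      # correct word
            else:
                diagonal = (cost + 1, s + 1, d, ins, c)  # substitution
            cost, s, d, ins, c = table[i - 1][j]
            deletion = (cost + 1, s, d + 1, ins, c)
            cost, s, d, ins, c = table[i][j - 1]
            insertion = (cost + 1, s, d, ins + 1, c)
            table[i][j] = min(diagonal, deletion, insertion, key=lambda t: t[0])
    return table[-1][-1][1:]


def wer(s, d, i, c):
    """Word error rate in percent, assuming N = S + D + C reference words."""
    reference_words = s + d + c
    if reference_words == 0:
        raise ValueError("reference transcript contains no words")
    return 100.0 * (s + d + i) / reference_words


# Made-up example: the listener's transcript drops one word of four.
counts = align_counts("saya suka makan nasi", "saya suka nasi")
print(counts, wer(*counts))
```

Because S + D + C always equals the number of words in the reference transcript, the denominator can equally be read as the reference length.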
Summary
Speech synthesis is the process of transforming a textual representation of speech into a waveform (Lim et al., 2012). We built Malay speech synthesizers using the phoneme and the letter as synthesis units. An investigation was conducted to observe the effects of the different synthesis units on synthetic speech. For the system using the phoneme as its synthesis unit, a dictionary was consulted to find the pronunciation of each word.
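The difference between the two synthesis units can be illustrated with a minimal sketch. It is not the authors' front end: the two-entry dictionary, the digraph notation for phonemes and the function names `letter_units` and `phoneme_units` are all hypothetical. It only shows why letter segmentation needs no linguistic resource while phoneme segmentation depends on a dictionary lookup.

```python
# Hypothetical toy pronunciation dictionary (illustrative entries only,
# not taken from the paper). Digraphs such as "ng" and "ny" are written
# as single phoneme symbols.
TOY_DICTIONARY = {
    "bunga": ["b", "u", "ng", "a"],
    "nyanyi": ["ny", "a", "ny", "i"],
}


def letter_units(word):
    """Split a word into letters; no knowledge of the language is needed."""
    return [ch for ch in word.lower() if ch.isalpha()]


def phoneme_units(word, dictionary=TOY_DICTIONARY):
    """Look up the phoneme sequence; fails for words missing from the dictionary."""
    key = word.lower()
    if key not in dictionary:
        raise KeyError(f"no dictionary entry for {word!r}")
    return list(dictionary[key])


for word in ("bunga", "nyanyi"):
    print(word, letter_units(word), phoneme_units(word))
```

In both toy entries the letter sequence is longer than the phoneme sequence, which matches the abstract's remark that the letter is the smaller unit in Malay.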
More From: Research Journal of Applied Sciences, Engineering and Technology