The Indonesian Language speech synthesizer based on the Hidden Markov Model

Kevin Alfianto Jangtjik, Dessi Fuji Lestari

doi:10.1109/iceecs.2014.7045211

Abstract

A speech synthesizer is a technology that gives a computer the ability to speak text sequences. In this research, we develop a speech synthesizer for the Indonesian language based on the Hidden Markov Model (HMM). An HMM-based speech synthesizer can produce more appropriate results than one based on syllable concatenation. Several studies have already built HMM-based speech synthesizers for Indonesian. However, they still have problems: they cannot distinguish between the two pronunciations of the vowel "e" (the "e" in "get" is different from the "e" in "apple"), and they cannot broadly handle abbreviations, numbers, special characters, and foreign (English) terms. In this research, we propose methods to solve those problems. To solve the "e" problem, we use separate HMMs for the two "e" vowels. To solve the other problems, we define rule sets for "e", abbreviations, numbers, special characters, and foreign terms. To evaluate the synthesizer, we employ two methods: the Mean Opinion Score (MOS), to measure the naturalness of the synthesized speech, and Semantically Unpredictable Sentences (SUS), to measure its accuracy. The results show that the developed speech synthesizer improves the naturalness of synthesized speech. It achieves a MOS of 4.1 and a word accuracy of 96.07%.
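The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the scoring procedure, so everything below is an assumption made for illustration: the function names `mean_opinion_score` and `word_accuracy` are hypothetical, MOS is taken as the plain average of listener ratings on a 1 to 5 scale, and SUS word accuracy is taken as the share of reference words that a listener transcribed correctly at the same position.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener ratings; assumes a 1-5 naturalness scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must lie between 1 and 5")
    return mean(ratings)


def word_accuracy(reference, transcript):
    """Percentage of reference words matched at the same position.

    Position-wise matching is an assumed simplification; the paper may
    score SUS transcriptions differently.
    """
    ref_words = reference.lower().split()
    heard_words = transcript.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # zip stops at the shorter sequence, so missing words count as errors.
    correct = sum(1 for r, h in zip(ref_words, heard_words) if r == h)
    return 100.0 * correct / len(ref_words)


if __name__ == "__main__":
    # Made-up listener ratings and a made-up SUS sentence, not data from the paper.
    print(mean_opinion_score([4, 5, 4, 3]))
    print(word_accuracy("meja itu makan awan biru", "meja itu makan awan baru"))
```

Position-wise matching keeps the sketch short but penalizes every word after an insertion or deletion; an alignment-based score (for example, one built on edit distance) would be more forgiving.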

Full Text