Abstract

Collecting a phonetically balanced text corpus is an important step in developing automatic speech recognition and text-to-speech systems. Such a corpus should contain a small number of sentences yet cover all phonetic units, such as monophone, triphone, and pentaphone units. The least-to-most greedy algorithm (LTM + Greedy) and a variant of it exist for selecting a minimum sentence set. The variant differs in the sentence scoring method, which affects the number of selected sentences. In this paper, we evaluate the sentence scoring methods of Zhang and Suyanto within the LTM + Greedy algorithm. The scoring methods are applied to the triphone and pentaphone units of the sentence collection. Triphone and pentaphone units offer higher-quality synthesized speech than monophone units. The dataset consists of Indonesian sentences collected from holy book translations, news, novels, dialogues, monologues, and question sentences. In total, 115,489 sentences are used in the experiments. The experiments show that LTM + Greedy with Suyanto's scoring produces a smaller set of sentences that contains a large number of phone units.
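
As a rough illustration of the selection scheme described above, the following is a minimal sketch of a least-to-most greedy pass over triphone units. It is not the scoring method of Zhang or of Suyanto, neither of which is specified in this abstract: the score used here (the number of still-uncovered units a sentence would add) is a placeholder assumption, and the function names `triphones` and `ltm_greedy` are hypothetical. Sentences are assumed to be given as already-transcribed phone sequences.

```python
from collections import Counter


def triphones(phones):
    """Return the set of triphone units (3-phone windows) in a phone sequence."""
    return {tuple(phones[i:i + 3]) for i in range(len(phones) - 2)}


def ltm_greedy(sentences):
    """Select sentence indices that together cover every triphone in the corpus.

    Units are visited from least to most frequent. For each unit that is still
    uncovered, the sentence containing it that adds the most uncovered units is
    selected. This score is a placeholder assumption, not a published method.
    """
    units = [triphones(s) for s in sentences]
    # Number of sentences in which each unit occurs.
    freq = Counter(u for sentence_units in units for u in sentence_units)
    uncovered = set(freq)
    selected = []
    # Least-to-most: rare units first; ties are broken by the unit itself.
    for unit in sorted(freq, key=lambda u: (freq[u], u)):
        if unit not in uncovered:
            continue
        candidates = [i for i, us in enumerate(units) if unit in us]
        # Placeholder score: count of uncovered units; lower index wins ties.
        best = max(candidates, key=lambda i: (len(units[i] & uncovered), -i))
        selected.append(best)
        uncovered -= units[best]
    return selected


# Toy corpus in which each character stands in for one phone.
corpus = [list("abcd"), list("bcd"), list("abc"), list("cde")]
print(ltm_greedy(corpus))
```

Swapping the `key` function passed to `max` is where a different sentence scoring method would plug in; the same loop applies to pentaphone units by changing the window width from 3 to 5.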
