Abstract

A text-to-speech (TTS) system converts text into speech in a specific language. Several TTS systems generate natural-sounding speech in widely studied languages such as English. The Kurdish language, by contrast, has only recently begun to be examined. Existing preliminary research on Kurdish speech synthesis has relied on older methods and produced low-quality speech that lacks important aspects of speech, including intonation, emphasis, and rhythm. Some approaches have been presented to address these challenges, such as concatenative (unit selection) and statistical parametric methods. However, they require a great deal of time, effort, and domain knowledge. Another factor behind the low performance of Kurdish speech synthesizers is the absence of publicly available speech corpora, unlike English, which has many freely available corpora and audiobooks. The motivation of this paper is to create a Central Kurdish speech corpus and to generate human-like speech from Kurdish text. This paper explains how to use Tacotron 2, an end-to-end neural network architecture, together with the HiFi-GAN vocoder to produce a high-quality, realistic, human-like Kurdish voice. This work uses (text, audio) pairs comprising 10 hours of recorded audio samples, with texts collected from the Internet and textbooks. It shows how to use English character embeddings as pre-trained knowledge with Kurdish characters as input, and how to preprocess the audio samples to obtain good results. Our evaluations on various types of text show a mean opinion score of 4.1, comparable with state-of-the-art synthesizers in other languages.
