An Arabic Text-To-Speech System Based on Artificial Neural Networks

Al-Said Al-Said

doi:10.3844/jcssp.2009.207.213

Abstract

Problem statement: With the rapid advancement of information technology and communications, computer systems increasingly offer users the opportunity to interact with information through speech. Interest in speech synthesis and in building voices is increasing. Worldwide, speech synthesizers have been developed for many popular languages, such as English, Spanish and French, and much research and development has been applied to those languages. Arabic, on the other hand, has been given little attention compared to other languages of similar importance, and research on Arabic is still in its infancy. Based on these ideas, we introduced a system to transform Arabic text retrieved from a search engine into spoken words. Approach: We designed a text-to-speech system that uses a concatenative speech synthesis approach to synthesize Arabic text. The synthesizer was based on artificial neural networks, specifically the unsupervised learning paradigm. Speech units of different sizes (words, diphones and triphones) were used to produce spoken utterances. We also built a dictionary of 500 common Arabic words. The smaller speech units (diphones and triphones) were chosen to achieve an unlimited vocabulary, while the word units were used for synthesizing a limited set of sentences. Results: The system showed very high accuracy in synthesizing Arabic text, and the output speech was highly intelligible. For the word and diphone unit experiments, we reached an accuracy of 99%, while for the triphone units we reached an accuracy of 86.5%. Conclusion: An Arabic text-to-speech synthesizer was built with the ability to produce an unlimited number of words with a high-quality voice.

Summary

Introduction

A Text-To-Speech synthesizer (TTS) is a computer-based program that processes text and reads it aloud. There is a demand for the technology to deliver speech of good, acceptable quality. The speech synthesizer consists of two main components: the text processing component and the Digital Signal Processing (DSP) module. The text processing component has two major tasks. First, it converts raw text containing symbols such as numbers and abbreviations into the equivalent written-out words; this process is often called text normalization. Second, it converts the text into some other representation and outputs it to the DSP module or synthesizer, which transforms the symbolic information it receives into speech.
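The text normalization step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abbreviation table, the function names `normalize` and `spell_digits`, and the digit-by-digit reading of numbers are all assumptions made for illustration, and English words are used only to keep the example readable.

```python
import re

# Hypothetical abbreviation table (an assumption, not from the paper);
# a real system would use a much larger, language-specific lexicon.
ABBREVIATIONS = {"Dr.": "Doctor", "km": "kilometers"}

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(match):
    """Read a number digit by digit (a simplification of numeral reading)."""
    return " ".join(DIGIT_WORDS[int(d)] for d in match.group())


def normalize(text):
    """Expand abbreviations and digits into written-out words."""
    # Expand abbreviations token by token; unknown tokens pass through.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    expanded = " ".join(tokens)
    # Replace each run of ASCII digits with its spoken form.
    return re.sub(r"[0-9]+", spell_digits, expanded)


print(normalize("Dr. Ali ran 12 km"))
```

The normalized string would then be converted into whatever symbolic representation the DSP module expects; that second task is not sketched here.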

Journal: Journal of Computer Science	Publication Date: Mar 1, 2009
Citations: 8	License type: cc-by
