Abstract
One approach to the transcription of written text into sounds (phonetization) is to use a set of well-defined, language-dependent rules, in most cases augmented by a dictionary of exceptional words that constitute their own rules. The transcription process starts by pre-processing the text into lexical items to which the rules can be applied. The rules can be divided into phonemic and phonetic rules. Phonemic rules operate on graphemes and convert them into phonemes. Phonetic rules operate on phonemes and convert them into phones, or actual sounds. Converting written text into actual sounds and developing a comprehensive set of rules for any language is hampered by several problems that originate in the relative lack of correspondence between the spelling of lexical items and their sound content. For Standard Arabic (SA) these problems are not as severe as they are for English or French, but they do exist. This paper presents a detailed investigation into all aspects of the phonetization of SA, with the aim of developing a comprehensive letter-to-sound conversion system for SA and assessing the quality of its transcriptions. In particular, the paper deals with the following issues: (1) an investigation of the spelling and other problems of the SA writing system and their impact on converting graphemes into phonemes; (2) the development of a comprehensive set of rules for transcribing graphemes into phonemes; (3) an investigation of the important contextual phonetic variations of SA phonemes, so as to determine viable variants (phones) of the phonemes; (4) the development of a set of rules for transcribing phonemes into phones; (5) the formulation of the grapheme-to-phoneme and phoneme-to-phone rules as algorithms that lend themselves to computer-based processing; and (6) an objective evaluation of the performance of the process of converting SA text into actual sounds. Phonetization of text is an important component of any natural language processing (NLP) domain that envisages text-to-speech (TTS) conversion, and it has applications beyond speech synthesis, such as acoustic modeling for speech recognition and other NLP applications.
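The two-stage process described above (exception dictionary, then phonemic rules, then phonetic rules) can be sketched as a minimal pipeline. This is only an illustration under stated assumptions: the romanized toy alphabet, the `EXCEPTIONS`, `G2P`, and `P2P` tables, and all function names are hypothetical placeholders and are not the rule set developed in the paper.

```python
# Toy letter-to-sound pipeline: exceptions -> phonemic rules -> phonetic rules.
# Every table below is an illustrative placeholder, NOT the paper's SA rules.

# Hypothetical exception dictionary: words that carry their own transcription.
EXCEPTIONS = {"hdha": ["h", "aː", "ð", "a"]}

# Hypothetical phonemic rules: grapheme (possibly multi-letter) -> phoneme.
G2P = {
    "sh": "ʃ", "th": "θ", "aa": "aː",
    "a": "a", "i": "i", "b": "b", "n": "n", "t": "t", "k": "k",
}

# Hypothetical phonetic rules: (phoneme, following phoneme) -> phone.
# Example context rule: /n/ is realized as [m] before /b/.
P2P = {("n", "b"): "m"}


def graphemes_to_phonemes(word):
    """Apply the exception dictionary, then longest-match phonemic rules."""
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    max_len = max(len(g) for g in G2P)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first so "sh" wins over a single letter.
        for size in range(max_len, 0, -1):
            chunk = word[i:i + size]
            if chunk in G2P:
                phonemes.append(G2P[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no phonemic rule for {word[i]!r} in {word!r}")
    return phonemes


def phonemes_to_phones(phonemes):
    """Apply context-dependent phonetic rules using the right-hand neighbor."""
    phones = []
    for idx, phoneme in enumerate(phonemes):
        following = phonemes[idx + 1] if idx + 1 < len(phonemes) else None
        phones.append(P2P.get((phoneme, following), phoneme))
    return phones


def phonetize(text):
    """Pre-process text into lexical items, then run both rule stages."""
    words = text.lower().split()
    return [phonemes_to_phones(graphemes_to_phonemes(w)) for w in words]


if __name__ == "__main__":
    print(phonetize("anba shaa hdha"))
```

Keeping the phonemic and phonetic stages as separate functions mirrors the separation of rule types in the text, so that each table can be extended or evaluated independently.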