Are alternative meanings of an Arabic homograph activated even when it is disambiguated by vowel diacritics?
The diacritical markers that represent most of the vowels in the Arabic orthography are typically omitted from written texts, thereby making many Arabic words phonologically and semantically ambigu...
- Research Article
14
- 10.1007/s11145-016-9677-1
- Aug 20, 2016
- Reading and Writing
The diacritical markers that represent most of the vowels in the Arabic orthography are generally omitted from written texts. Previous research revealed that the absence of diacritics reduces reading comprehension performance even by skilled readers of Arabic. One possible explanation is that many Arabic words become ambiguous when diacritics are missing. Words of this kind are known as heterophonic homographs and are associated with at least two different pronunciations and meanings when written without diacritics. The aim of the two experiments reported in this study was to investigate whether the presence of diacritics improves the comprehension of all written words, or whether the effects are confined to heterophonic homographs. In Experiment 1, adult readers of Arabic were asked to decide whether written words had a living meaning. The materials included heterophonic homographs that had one living and one non-living meaning. Results showed that diacritics significantly increased the accuracy of semantic decisions about ambiguous words but had no effect on the accuracy of decisions about unambiguous words. Consistent results were observed in Experiment 2 where the materials comprised sentences rather than single words. Overall, the findings suggest that diacritics improve the comprehension of heterophonic homographs by facilitating access to semantic representations that would otherwise be difficult to access from print.
- Conference Article
81
- 10.3115/1621804.1621813
- Jan 1, 2004
This paper discusses several issues in Arabic orthography that were encountered in the process of performing morphology analysis and POS tagging of 542,543 Arabic words in three newswire corpora at the LDC during 2002--2004, by means of the Buckwalter Arabic Morphological Analyzer. The most important issues involved variation in the orthography of Modern Standard Arabic that called for specific changes to the Analyzer algorithm, and also a more rigorous definition of typographic errors. Some orthographic anomalies had a direct impact on word tokenization, which in turn affected the morphology analysis and assignment of POS tags.
- Research Article
1
- 10.4304/tpls.3.9.1497-1508
- Sep 1, 2013
- Theory and Practice in Language Studies
Reading and writing disabilities and generalized cognitive dysfunction are developmental in origin and are likely linked to abnormalities in brain function. In this article, we detail selective reading and writing disturbances in the spoken and written Arabic orthography of an Arab teenager (RL) who communicates with his friends via readable and understandable electronic messages. We examine the performance of RL, who was diagnosed as learning disabled, in his reading and writing of Arabic words and text in Latin orthography compared to his reading and writing in Arabic orthography. Cognitive and verbal abilities in Arabic and Latin electronic orthography were tested using traditional pen and paper as well as electronic devices. The results underline the importance of the effect of the type of Arabic orthography on reading and writing fluency. Index Terms—learning disability, reading, diglossia, Arabic, orthography, Latin, electronic
- Research Article
5
- 10.3844/jcssp.2020.956.965
- Jul 1, 2020
- Journal of Computer Science
Instant translator applications can be very useful when traveling, especially when one knows little of the language of the destination country. Arabic-to-English instant translation has not yet been made available by most applications. In this article, we attempt to mimic an application for instant Arabic-to-English translation. The system provides translation for the Arabic Triangle of Language (ālmṯlṯātāllġwyh), which comprises sets of three Arabic words that are homographs. The process starts by capturing an image of a homograph using a mobile phone camera; the captured word is then recognized, taking the diacritic markers into consideration, using Arabic Optical Character Recognition (OCR). Finally, the system provides an English translation of the homograph. The researchers used Histograms of Oriented Gradients (HOG) features and a set of structural and geometrical features of the segmented Arabic word, with a multi-class SVM classifier for classification, before providing the English meaning.
- Research Article
127
- 10.1111/1467-9817.00177
- Oct 1, 2002
- Journal of Research in Reading
The reading process in Arabic as a function of vowels and sentence context is reviewed. Reading accuracy and reading comprehension results are reviewed in the light of cross–cultural reading, in order to develop a more comprehensive reading theory. Phonology, morphology and sentence context are considered key variables in explaining the reading process in Arabic orthography. Phonology (in the form of short vowels) affects reading accuracy as well as reading comprehension, regardless of reading level, age, material and reading conditions. Initial visual–orthographic processing identifies the morphology (i.e. the triliteral/quadriliteral roots of Arabic words) which then enables access to the mental lexicon. Sentence context is also essential in reading Arabic orthography regardless of the reader’s level, age, material and reading condition. The phonology, morphology and sentence context of Arabic are presented in two suggested reading models for poor/beginner Arabic readers and for skilled Arabic readers.
- Research Article
43
- 10.1080/17586801.2015.1114910
- Feb 18, 2016
- Writing Systems Research
The current study tested the impact of vowelisation on reading speed and accuracy of Arabic words among skilled and poor native Arabic readers using a cross-sectional procedure. One hundred and forty-three skilled and 146 poor native Arab readers from northern Israel (second, fourth and sixth grades) read two lists of fully vowelised and non-vowelised words. The results indicate that among the readers, the non-vowelised words were read more accurately than the vowelised words. For the skilled poor readers, such significant differences were found within the older reader groups only (the fourth and sixth grades). Differences in the speed of reading vowelised and non-vowelised words were found within the older groups only in both groups of readers. The results are discussed in light of different approaches in the field of visual word recognition. It is suggested that vowelisation for skilled and older readers could cause a visual load during the process of the visual recognition of words and may be co...
- Conference Article
- 10.5339/qfarc.2018.ssahpd880
- Jan 1, 2018
Building a Rich Lexical Resource for Standard Arabic
- Research Article
- 10.32996/fcsai.2025.2.2.1
- Nov 25, 2025
- Frontiers in Computer Science and Artificial Intelligence
Arabic has three long vowels /a:/, /u:/, /i:/ and three short vowels /a/, /u/, /i/, which are represented by diacritics marked over and under consonant letters. In words that have short vowels, only the consonants are written. Arabs usually read without any diacritics except for the Holy Quran and the Prophet’s Hadiths. The absence of short vowel diacritics poses pronunciation difficulties for Artificial Intelligence (AI). Hence, this study aims to analyze a sample of Arabic YouTube videos narrated by AI to find out which Arabic words are mispronounced by AI narrators. It was noted that AI narrators speak with a natural voice, good expression and intonation. They make no grammatical or syntactic errors. But they make pronunciation errors, especially in diacritics and homographs (words that are spelled the same but have different pronunciations and meanings depending on the short vowel diacritics, which are not usually shown on written words). This means that AI has difficulty matching the pronunciation of a homograph with the context in which it is used. They confuse short vowel diacritics on the suffixes /ta/, /ti/, /tu/ (تاء التأنيث) when they refer to first, second, or third person, masculine or feminine, imperative and past tense (كَتَبَ كُتِب كُتُب كَتّبَ كُتّب كتبتَ كتبتُ كتَبتِ كُتِبَتْ كُتّبت). This affects comprehension in L2 learners and causes cacophony and distortion for native speakers and non-native speakers of Arabic. The article sheds light on how AI reads Arabic aloud; the classification of pronunciation errors in AI narration; variations in the type and frequency of pronunciation errors across videos; why AI makes mistakes in pronouncing Arabic but does not make grammatical or syntactic errors in AI-narrated content; and how AI pulls off realistic intonation. Additionally, the article gives suggestions for improvement and recommendations for students learning Arabic as a foreign language.
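The way several vowelised forms collapse into one written string when diacritics are dropped can be illustrated with a minimal Python sketch. It is not taken from the study: the helper names `strip_diacritics` and `group_homographs` are hypothetical, and the sketch assumes that the relevant marks are the Unicode code points U+064B to U+0652 (tanwin, fatha, damma, kasra, shadda, sukun).

```python
import re
from collections import defaultdict

# Assumption: the short-vowel and related marks are U+064B..U+0652.
DIACRITICS = re.compile("[\u064B-\u0652]")


def strip_diacritics(word: str) -> str:
    """Return the bare consonantal skeleton of a vowelised word."""
    return DIACRITICS.sub("", word)


def group_homographs(words):
    """Group vowelised forms by the string they share once diacritics are removed."""
    groups = defaultdict(list)
    for word in words:
        groups[strip_diacritics(word)].append(word)
    return dict(groups)


# Three forms from the abstract, written with escapes so the code points are explicit.
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf+fatha, ta+fatha, ba+fatha
KUTIB = "\u0643\u064F\u062A\u0650\u0628"         # kaf+damma, ta+kasra, ba
KUTUB = "\u0643\u064F\u062A\u064F\u0628"         # kaf+damma, ta+damma, ba
BARE = "\u0643\u062A\u0628"                      # the shared unvowelised spelling

groups = group_homographs([KATABA, KUTIB, KUTUB])
print(len(groups[BARE]))
```

All three forms map to the same unvowelised key, which is the ambiguity a text-to-speech system has to resolve from context.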
- Research Article
53
- 10.4236/ojml.2013.31005
- Jan 1, 2013
- Open Journal of Modern Linguistics
This study aimed to examine the effects of visual characteristics of Arabic orthography on learning to read compared to Hebrew among Arabic and Hebrew bilinguals in an elementary bilingual education framework. Speed and accuracy measures were examined in reading words and non-words in Arabic and Hebrew as follows: in Arabic, words and non-words composed of connected and similar letters, words and non-words composed of connected and non-similar letters, and words and non-words composed of unconnected letters; in Hebrew, words and non-words composed of similar letters and non-similar letters. It was found that Arabic speakers showed almost equal control in all reading tasks in both languages, whereas Hebrew speakers showed better performance in their mother tongue in all reading tasks. In Arabic, the best performance was in reading words and non-words that were unconnected. Based on these findings, it was concluded that Hebrew speakers did not succeed in transferring their good ability in reading their mother tongue to reading the second language, apparently due to the unique nature of the Arabic orthography. Our findings are discussed with regard to the cross-linguistic research literature as well as the specific features of the Arabic language.
- Research Article
3
- 10.4197/eng.16-2.9
- Jan 1, 2005
- Journal of King Abdulaziz University-Engineering Sciences
Since its invention in 1886, the International Phonetic Alphabet (IPA) has remained the only alphabet that can represent all the sounds of world languages. However, the IPA symbols are based on the Roman letters. This means that speakers of languages which possess a different orthography are not able to use them. For example, Arab linguists and researchers find it almost impossible to use IPA when they write in Arabic script; Arabic orthography differs in shape and direction of writing. In addition, IPA does not have symbols for the emphatic sounds: /ص/, /ض/, /ط/, /ظ/. To make this possible for Arab linguists and researchers who work on language sounds, a new alphabet has been designed. This paper presents the Arabic International Phonetic Alphabet (AIPA). AIPA consists of symbols that are based on the Arabic orthographic system. It covers all the symbols in IPA in addition to some Arabic sounds which do not have representations in IPA. AIPA has been designed and is now available as fonts. The AIPA fonts can be used in typing and entering linguistic data into computers. AIPA symbols differ from Arabic orthography in terms of independence; each symbol is not connected to the adjacent symbols. So, an Arabic word such as صوتية is written as: .\صاوتئ2ياه
- Research Article
108
- 10.1093/wsr/wsr014
- Jan 1, 2011
- Writing Systems Research
Previous research has suggested that reading Arabic is slower than reading Hebrew or English, even among native Arabic readers. We tested the hypothesis that at least part of the difficulty in reading Arabic is due to the visual complexity of Arabic orthography. Third- and sixth-grade native readers of Arabic who were studying Hebrew in school were asked to detect a vowel diacritic in the context of Hebrew words and nonwords, Arabic words and nonwords (including connected and unconnected Arabic letters), and nonletter stimuli that resembled Arabic or Hebrew letters. Participants were better at detecting target vowels in Hebrew than in any of the Arabic conditions. Moreover, target detection in Arabic was better for letter strings containing connected letters than for those containing unconnected letters. The findings extend previous results on Hebrew versus Arabic reading and support a perceptual load account of the source of processing difficulty in reading Arabic. Performance in the Arabic conditions di...
- Research Article
2
- 10.1515/comp-2018-0017
- Dec 1, 2018
- Open Computer Science
We demonstrate several ways to use morphological word analogies to examine the representation of complex words in semantic vector spaces. We present a set of morphological relations, each of which can be used to generate many word analogies. 1. We show that the difference-vectors for pairs which have the same relation to each other are similarly aligned. 2. We suggest that addition of difference-vectors is a useful phrase-building operator. 3. We propose that pairs in the same relation may have similar relative frequencies. 4. We suggest that homographs, which necessarily have the same semantic vectors, can sometimes be separated into different vectors for different senses, using frequency estimates and alignment constraints obtained from word analogies. 5. We observe that some of our analogies seem to be parallel, and might be combined. We use Arabic words as a case study, because Arabic orthography includes verb conjugations, object pronouns, definitive articles, possessive pronouns, and some prepositions in single word-forms. Therefore, a number of short phrases, built up of easily perceived constituents, are already present in stock semantic spaces for Arabic available on the web. Similar phrases in English would require including bigrams or trigrams as lemmas in the word embedding, although English derivational morphology allows for other relationships in standard semantic spaces which Arabic does not, for example negation. We make our corpus of morphological relations available to other researchers.
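The first claim, that difference-vectors for pairs in the same relation are similarly aligned, can be illustrated with a small Python sketch. This is not the authors' code or data: the three-dimensional vectors in `toy_space` are invented for illustration, the keys are hypothetical placeholders rather than real Arabic word-forms, and cosine similarity is assumed here as the alignment measure.

```python
import math


def difference(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embedding: two stems and the same affix attached to each.
toy_space = {
    "stem_a": [1.0, 0.0, 0.0],
    "stem_a+affix": [1.0, 1.0, 0.0],
    "stem_b": [0.0, 0.0, 1.0],
    "stem_b+affix": [0.0, 1.0, 1.0],
}

# Difference-vectors for the two pairs that share the same morphological relation.
d_a = difference(toy_space["stem_a+affix"], toy_space["stem_a"])
d_b = difference(toy_space["stem_b+affix"], toy_space["stem_b"])

alignment = cosine(d_a, d_b)
print(alignment)
```

In a real embedding the alignment would be lower than in this contrived case; the sketch only shows how the comparison is set up.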
- Research Article
- 10.32890/mjli.12.2015.7679
- Jan 1, 2015
- Malaysian Journal of Learning and Instruction
Purpose – This study aims to investigate the use of diacritics in the Arabic script of Malay to facilitate Arab postgraduate students of UKM to read the Malay words accurately. It is hypothesised that the Arabic script could facilitate the reading of Malay words among the Arab students because of their earlier exposure to the Arabic script in comparison to the Romanised script. Method – Twelve Arabic first language speakers participated in a reading experiment that used DMDX, a Win 32-based display system for psychological experiments, to investigate whether or not Arabic vowel diacritics can facilitate Arabic first language speakers to read Malay words accurately. A total of 100 Malay bi-syllabic words were used as stimuli in three different forms: 1) Arabic script without diacritics; 2) Arabic script with diacritics; and 3) Romanised script. The participants’ responses and reaction times were recorded to analyse accuracy and speed. Findings – Arabic first language speakers were more accurate when reading words in Arabic script of Malay with diacritics and when reading Romanised script than when reading Arabic words without diacritics. Arabic speakers read Malay words faster in Arabic script without diacritics and in Romanised scripts than when reading words in Arabic script with diacritics. Significance – This study shows that the use of a more familiar script to a certain extent does facilitate language learners to produce the target language more accurately compared to using a less familiar script. Hence, educators should explore any possible means to scaffold learners in their learning process.
- Research Article
27
- 10.1080/17586801.2013.834244
- Oct 1, 2013
- Writing Systems Research
Previous research has suggested that reading Arabic is more challenging than reading Hebrew or English, even among native Arabic readers due to the visual complexity of the Arabic orthography. In particular, the fact that most of the Arabic letters connect to each other and change their basic form according to their place in the written word (beginning, middle or end) has been hypothesised to constitute a visual load affecting reading efficiency. Here, we tested this visual complexity hypothesis by manipulating word-internal orthographic connectivity during visual word recognition. Fifty-eight adult skilled readers and 20 disabled readers of Arabic performed a lexical-decision task using words (and nonwords) whose letters were naturally fully connected (Cw), partially connected (PCw) and nonconnected (NCw). Behavioural measures for words as a function of word connectivity (and word frequency) were analysed using repeated measures analysis of variance. The results revealed that within both groups of reader...
- Research Article
52
- 10.4236/ce.2013.44036
- Jan 1, 2013
- Creative Education
The aim of this study was to examine the effect of vowelization on reading Arabic orthography. Native Arabic-speaking children were asked to read aloud words (vowelized and unvowelized) and pseudowords. The results showed that unvowelized words were read aloud more quickly and more accurately than the shallow, fully vowelized Arabic words. The disadvantage of vowelized words in both speed and accuracy was therefore unexpected and, furthermore, inconsistent with findings from several other relevant studies. The results suggested that Arab children used a different perceptual and coding strategy when the stimuli differed in their lexical feature (word vs pseudoword) and visual/orthographic feature (vowelized vs unvowelized).