Abstract

The Arabic script has a set of optional diacritics (taškīl) that primarily indicate short vowels. These diacritics are used to varying extents, giving a form of orthographic variation potentially affecting every word in a text and various aspects of the reading process. This study is the first empirical investigation into the variation in how Arabic diacritics are used. It employs quantitative corpus linguistic methods to explore diacritization in a 72-million-word corpus consisting of book-length texts of various genres. Children’s literature and poetry were found to vary considerably in the number of diacritics used, while books of normal prose fall within a narrow range of limited use of diacritics. Furthermore, the different diacritics, subdivided by function, were found to follow a hierarchical order of priority that is largely consistent across genres. These findings call into question common descriptions of the Arabic writing system as binarily diacritized or undiacritized. Further lines of research based on these findings are suggested.
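
As an illustration of the kind of measurement such a corpus study involves, the following is a minimal sketch of counting diacritics in a text. It is not the study's own procedure. The function name `diacritic_profile`, the restriction to the eight basic taškīl marks (U+064B to U+0652), and the definition of the rate as diacritics per Arabic letter are all assumptions made for this example.

```python
from collections import Counter

# Assumption: only the eight basic taškīl marks are counted here.
DIACRITICS = {
    "\u064B": "fathatan",
    "\u064C": "dammatan",
    "\u064D": "kasratan",
    "\u064E": "fatha",
    "\u064F": "damma",
    "\u0650": "kasra",
    "\u0651": "shadda",
    "\u0652": "sukun",
}

TATWEEL = "\u0640"  # elongation character, not a letter


def is_arabic_letter(ch: str) -> bool:
    """True for characters in the basic Arabic letter range, excluding tatweel."""
    return "\u0621" <= ch <= "\u064A" and ch != TATWEEL


def diacritic_profile(text: str) -> tuple[float, Counter]:
    """Return (diacritics per Arabic letter, count of each diacritic) for a text.

    The rate definition is a hypothetical choice for this sketch.
    """
    counts = Counter(DIACRITICS[ch] for ch in text if ch in DIACRITICS)
    letters = sum(1 for ch in text if is_arabic_letter(ch))
    rate = sum(counts.values()) / letters if letters else 0.0
    return rate, counts


if __name__ == "__main__":
    # Same word written with a fatha on every letter, then with no diacritics.
    full = "\u0643\u064E\u062A\u064E\u0628\u064E"
    bare = "\u0643\u062A\u0628"
    for sample in (full, bare):
        rate, counts = diacritic_profile(sample)
        print(f"{rate:.2f}", dict(counts))
```

Computed per book, a rate like this gives a continuous value rather than a yes/no label, which is the kind of measure needed to compare genres by degree of diacritization.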
