Abstract

Arabic has many intricate grammar rules that can be difficult for the average user or learner. Automatic grammar checking systems can improve the quality of text, reduce the cost of proofreading, and play a role in grammar teaching. This paper presents an initiative toward developing a novel and comprehensive Arabic grammar auditor that can handle vowelized texts, which we call the “Arabic Grammar Detector” (AGD-أَجِــدْ). AGD is implemented using a dependency grammar and a decision tree classifier model. Its purpose is to extract patterns of grammatical rules from a projective dependency graph in order to designate the appropriate syntactic dependencies of a sentence. The current implementation covers almost all regular Arabic grammar rules for nonvowelized texts as well as partially or fully vowelized texts. AGD was evaluated using the Tashkeela corpus. It can detect more than 94% of grammatical errors and hint at their causes and possible corrections.

Highlights

  • As with all languages, researchers into the Arabic language and its applications have exerted strenuous efforts to achieve progress in language processing. These efforts have focused on multiple levels, including morphology [1, 2], which studies and characterizes the structure of words, syntax [3, 4], the grammatical arrangement of words, and semantics [5, 6], which determines a text’s exact meaning.

  • We attempt here to bridge the research gap by implementing a comprehensive supervised learning system that detects grammatical errors and hints at their causes and corrections in diacritized and nondiacritized Arabic text. The dependency grammar of the Arabic grammatical rules adopted in this research enables us to parse the Arabic structure graph to infer the correct pattern of grammatical rules for a sentence based on the properties of its words, which are extracted via a morphological analyzer.

  • During the construction of the AGD, we found that some errors may go undetected because of incorrect analysis in the preprocessing operation or because of semantic ambiguity. These errors are outside the scope of the AGD because they occur in the preprocessing stage that precedes the grammar auditor.

Summary

Introduction

Researchers into the Arabic language and its applications have exerted strenuous efforts to achieve progress in language processing. These efforts have focused on multiple levels, including morphology [1, 2], which studies and characterizes the structure of words, syntax [3, 4], the grammatical arrangement of words, and semantics [5, 6], which determines a text’s exact meaning. The flexible arrangement of words in a sentence, the properties of agglutination, and diacritics complicate Arabic grammar. All these properties lead to a variety of issues on the morphological, syntactic, and semantic levels. The dependency grammar of the Arabic grammatical rules adopted in this research enables us to parse the Arabic structure graph to infer the correct pattern of grammatical rules for a sentence based on the properties of its words, which are extracted via a morphological analyzer. The second study on Arabic grammar checkers was presented by Moukrim et al. [35]. Their system uses the Arabic grammar described in the ontology [36] to generate constraints and sentence rules. One of the tasks of language studies is to identify the proper grammatical syntax for every sentence within a specific formalism and grammar [39]. This backbone of the hierarchy can be extended (by adding subrules) to include several extensions and enhancements intended to facilitate and improve its usage in certain applications.
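The abstract refers to a projective dependency graph. As a minimal illustration of that property, and not the AGD implementation, the sketch below checks whether a dependency analysis is projective, that is, whether no two arcs cross when the words are laid out in sentence order. The function name `is_projective` and the head-list encoding (each token stores the 1-based position of its head, with 0 standing for an artificial root placed before the first word) are assumptions made for this example.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if no two dependency arcs cross.

    Assumed encoding (hypothetical, not taken from the paper):
    heads[i] is the 1-based position of the head of token i + 1,
    and 0 denotes an artificial root placed before the first token.
    """
    # Store each arc as (left endpoint, right endpoint) in sentence order.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]

    for i, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[i + 1:]:
            # Two arcs cross when exactly one endpoint of one arc
            # lies strictly inside the span of the other.
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                return False
    return True


if __name__ == "__main__":
    # Three tokens: token 2 is the root; tokens 1 and 3 attach to it.
    print(is_projective([2, 0, 2]))
    # Four tokens: arc 1-3 crosses arc 2-4.
    print(is_projective([3, 4, 0, 3]))
```

The pairwise comparison is quadratic in sentence length, which is acceptable for single sentences; a grammar checker that assumes projectivity could use such a test to flag parses that violate the assumption.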

Syntax Dependency and Classification Model
Evaluation
Stage 1
Stage 2
Conclusions