Abstract

The knowledge-driven economy relies on technology, which increases the demand for language tools and resources to acquire and distribute knowledge. Such tools and resources are scarce for the under-resourced, spoken Bantu languages. This paper develops a computational grammar for the Ekegusii language in the Grammatical Framework (GF) to bridge the gap. The grammar development follows a bottom-up, modular approach. A machine translation experiment was set up to evaluate the grammar, yielding BLEU and PER scores of 55.95% and 19.49%, respectively. This work contributes a computational grammar for an under-resourced language, thereby providing a platform for analysis and synthesis as well as machine translation within the GF ecosystem.

Highlights

  • Technology- and knowledge-driven economies demand natural language processing (NLP) tools and resources to acquire and distribute knowledge and information [1]

  • This paper performs grammar engineering for Ekegusii, an under-resourced language, resulting in a computational grammar written in the Grammatical Framework (GF)

  • Position-independent Error Rate (PER) and Word Error Rate (WER) based on Levenshtein distance [35] are well suited to investigating Ekegusii errors, since the language has extensive nasal insertion, deletion and substitution, especially where morphemes join at the word level
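The two error rates named in the highlights can be sketched as follows. This is a minimal illustration, not the paper's evaluation script: it assumes whitespace tokenization, WER defined as word-level Levenshtein distance divided by reference length, and one common bag-of-words formulation of PER. The function names and the English example sentences are hypothetical and were chosen only to show how the two metrics differ.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent Error Rate: compares bags of words, ignoring order.

    Assumed formulation: 1 - (matches - max(0, |hyp| - |ref|)) / |ref|.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 1 - (matches - max(0, len(hyp) - len(ref))) / len(ref)


if __name__ == "__main__":
    reference = "the house is big"
    reordered = "big is the house"
    print(wer(reference, reordered), per(reference, reordered))
```

A hypothesis that contains the right words in the wrong order is penalized by WER but not by PER, which is why the two are usually reported together: the gap between them indicates how much of the error is due to word order rather than word choice.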


Summary

INTRODUCTION

Technology- and knowledge-driven economies demand natural language processing (NLP) tools and resources to acquire and distribute knowledge and information [1].

GF has a single abstract syntax that defines a set of categories (Cat) of trees, a set of functions (Fun) that build those trees together with their types, and a start category [10]. It has many parallel concrete syntaxes, one for each language grammar. These syntaxes define the linearization of both the categories (lincat) and the functions (lin) stated in the abstract syntax, as exemplified by the category Noun (N) with the string “house” [7].

The survey of related work demonstrates that little has been done to develop NLP tools and resources for Ekegusii; this computational grammar is therefore a significant contribution.
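The split between one abstract syntax and several concrete syntaxes, described in the introduction, can be mimicked in a few lines of Python. This is only an analogy under stated assumptions, not GF code and not the paper's grammar: the dictionaries, the function names `linearize`, `parse` and `translate`, and the Swahili entry are all hypothetical illustrations. Only the category N and the string "house" come from the text.

```python
# Abstract syntax: categories, typed functions, and a start category.
ABSTRACT = {
    "cats": {"N"},
    "funs": {"house_N": "N"},  # mirrors a declaration such as: fun house_N : N
    "startcat": "N",
}

# Concrete syntaxes: one linearization table per language.
# "Eng" uses the string from the text; "Swa" is an illustrative
# second language added here, not taken from the paper.
CONCRETE = {
    "Eng": {"house_N": "house"},
    "Swa": {"house_N": "nyumba"},
}


def linearize(fun, lang):
    """Turn an abstract function into a string in the given language."""
    if fun not in ABSTRACT["funs"]:
        raise KeyError(f"unknown abstract function: {fun}")
    return CONCRETE[lang][fun]


def parse(word, lang):
    """Return every abstract function whose linearization is `word`."""
    return [fun for fun, s in CONCRETE[lang].items() if s == word]


def translate(word, source, target):
    """Translate by parsing in one language and linearizing in another."""
    return [linearize(fun, target) for fun in parse(word, source)]


if __name__ == "__main__":
    print(translate("house", "Eng", "Swa"))
```

Because both languages share the same abstract function, translation reduces to parsing in the source language and linearizing in the target language. Adding a language means adding one more linearization table without touching the abstract syntax.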

EKEGUSII DESCRIPTIVE GRAMMAR
Syntax
IMPLEMENTING EKEGUSII GRAMMAR IN GF
Morphology
RESULTS AND DISCUSSION
CONCLUSION