Abstract

Standardized resources are key components in the development of human language technology applications. It is therefore important to adopt standards when designing lexical resources, especially for less-resourced languages such as Amazigh. This language is spoken by many North African communities, including in Morocco. Due to historical, geographical and sociolinguistic factors, the Amazigh language is characterized by a proliferation of varieties, which has led to a complex morphology. This complexity poses a significant challenge to NLP tasks, especially since Amazigh belongs to the Afro-Asiatic (Hamito-Semitic) language family, known for its non-concatenative morphology based on roots and patterns. Given the scarcity of Amazigh language resources that deal with morpheme encoding, orthographic changes and morphotactic variations, a standardized lexical resource should make wide exchange and reuse possible. In this context, this paper describes ongoing work on a morphological lexicon, based on inflected forms, for the standard Moroccan Amazigh language.

Highlights

  • The Amazigh language is a prominent element of Moroccan cultural heritage

  • To provide a list of all inflected forms of the standard Moroccan Amazigh language, useful at the different stages of building morphological tools (modelling, enrichment and evaluation), we propose adapting part of the LMF core model specification

  • To take advantage of lexical resources and make them useful for natural language processing (NLP) tasks, this paper proposes the first version of a large-coverage morphological lexicon for the standard Moroccan Amazigh language

Summary

INTRODUCTION

The Amazigh language is a prominent element of Moroccan cultural heritage. It was not integrated into the education system until 2003. Various models of lexical resources have been designed and implemented during the last decade for specific purposes. These models range from glossaries [1,2,3,4] to morphological lexicons built with the NooJ platform [5], Xerox FST tools [6] and the UNL framework [7]. We have applied the LMF modelling framework to build an inflected-form lexicon of the standard Moroccan Amazigh language.
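
As a rough illustration of what an inflected-form entry might look like under LMF, the following Python sketch builds a `LexicalEntry` with a `Lemma` and two `WordForm` elements described by feature/value (`feat`) pairs. The element names follow the LMF core and morphology packages (ISO 24613). The helper names `feat` and `build_entry`, the feature labels, and the noun example argaz / irgazn ("man" / "men") are assumptions chosen for illustration and are not taken from the lexicon described here.

```python
import xml.etree.ElementTree as ET


def feat(parent, att, val):
    """Attach an LMF-style <feat att="..." val="..."/> child to parent."""
    return ET.SubElement(parent, "feat", {"att": att, "val": val})


def build_entry(lemma, pos, word_forms):
    """Build one LexicalEntry (hypothetical helper, not the paper's code).

    word_forms is a list of (written_form, {feature: value}) pairs,
    one pair per inflected form.
    """
    entry = ET.Element("LexicalEntry")
    feat(entry, "partOfSpeech", pos)

    # The Lemma holds the citation form of the entry.
    lemma_el = ET.SubElement(entry, "Lemma")
    feat(lemma_el, "writtenForm", lemma)

    # Each inflected form becomes its own WordForm with its features.
    for written, features in word_forms:
        wf = ET.SubElement(entry, "WordForm")
        feat(wf, "writtenForm", written)
        for att, val in features.items():
            feat(wf, att, val)
    return entry


# Illustrative data only (assumed example, Latin transcription).
entry = build_entry(
    "argaz",
    "noun",
    [
        ("argaz", {"grammaticalNumber": "singular"}),
        ("irgazn", {"grammaticalNumber": "plural"}),
    ],
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Listing every inflected form explicitly, rather than deriving forms by rule at lookup time, is what makes such a resource directly usable for evaluating morphological tools, at the cost of a larger lexicon.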

Lexical Markup Framework
Historical background
Amazigh inflection features
USING LMF FOR AMAZIGH INFLECTION
Data model
Conclusion