Abstract

This paper reports on work in progress toward a morphological analyzer for Egyptian Arabic (EGY). To build such a tool, a corpus of 527,000 EGY words has been compiled from different sources and genres, a tag-set has been developed, and about 239,000 words have been morphologically annotated according to their contexts. Each annotated word is associated with its Proclitic(s), Original Word Form, Tag, Enclitic(s), Glossary, Number, Gender, Definiteness, Conventional Lemma and Word Form. Each word is assigned a conventional orthography that stays as close as possible to the EGY pronunciation, regardless of how the word is typically written.
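To make the annotation fields listed above concrete, one annotated word could be modelled as a small record, as in the minimal Python sketch below. The class name, attribute names, value vocabularies (such as "NOUN" or "singular") and the sample word are all illustrative assumptions; the abstract does not specify the actual tag-set or storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotatedWord:
    """One annotated token. Field names mirror the list in the abstract;
    the value vocabularies are hypothetical, not the paper's tag-set."""
    original_form: str       # the word as written in the source text
    tag: str                 # morphological tag (placeholder values here)
    gloss: str               # English gloss ("Glossary" in the abstract)
    number: str
    gender: str
    definiteness: str
    conventional_lemma: str  # lemma in the conventional orthography
    conventional_form: str   # word form in the conventional orthography
    proclitics: List[str] = field(default_factory=list)
    enclitics: List[str] = field(default_factory=list)

    def has_clitics(self) -> bool:
        """True if any proclitic or enclitic is attached to the word."""
        return bool(self.proclitics or self.enclitics)


# Illustrative record only, not taken from the annotated corpus.
example = AnnotatedWord(
    original_form="وكتابه",
    tag="NOUN",
    gloss="book",
    number="singular",
    gender="masculine",
    definiteness="definite",
    conventional_lemma="كتاب",
    conventional_form="وكتابه",
    proclitics=["و"],
    enclitics=["ه"],
)
```

Keeping the original and conventional forms as separate fields reflects the abstract's point that the conventional orthography may differ from how a word is typically written.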
