Towards Zulu corpus clean-up, lexicon development and corpus annotation by means of computational morphological analysis

Sonja Bosch, Laurette Pretorius

doi:10.1080/02572117.2019.12063275

Abstract

This article reports on a practical, semi-automated procedure for creating a clean, morphologically annotated Zulu corpus of tractable size that could eventually serve both as a gold standard for Zulu computational morphology and as a basis for further linguistic annotation. A corpus development architecture is proposed that includes the corpus in its various stages of development, a pre-processing module, the Zulu morphological analyser and its guesser variant, the machine-readable lexicon that serves as a comprehensive lexical database for Zulu, and a human elicitation function for ensuring the integrity of the lexical database. The approach is novel in that an existing rule-based, finite-state Zulu computational morphological analyser is used as the core technology in this procedure to handle the complex, agglutinative nature of Zulu morphology. The corpus, at present consisting of the Zulu version of the South African Constitution, will have morphological analysis and tagging as its first level of annotation.
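The flow between the components named in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: every name in it (`preprocess`, `analyse`, `guess`, `annotate`, `accept`) is hypothetical, the finite-state analyser is replaced by a plain dictionary lookup, the guesser is a trivial stand-in, and the analysis strings are opaque placeholders rather than real Zulu morphological analyses.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Annotation:
    token: str
    analysis: Optional[str]
    source: str  # "analyser", "guesser" or "none"


def preprocess(text: str) -> List[str]:
    """Toy pre-processing module: lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def analyse(token: str, lexicon: Dict[str, str]) -> Optional[str]:
    """Stand-in for the morphological analyser: a lexicon lookup (assumption)."""
    return lexicon.get(token)


def guess(token: str) -> Optional[str]:
    """Stand-in for the guesser variant: proposes a placeholder analysis."""
    return f"GUESS({token})" if len(token) > 2 else None


def annotate(text: str, lexicon: Dict[str, str]) -> Tuple[List[Annotation], List[str]]:
    """Annotate each token; queue tokens the analyser missed for human review."""
    annotations: List[Annotation] = []
    review_queue: List[str] = []
    for token in preprocess(text):
        analysis = analyse(token, lexicon)
        if analysis is not None:
            annotations.append(Annotation(token, analysis, "analyser"))
            continue
        guessed = guess(token)
        source = "guesser" if guessed is not None else "none"
        annotations.append(Annotation(token, guessed, source))
        if token not in review_queue:
            review_queue.append(token)
    return annotations, review_queue


def accept(token: str, analysis: str, lexicon: Dict[str, str]) -> None:
    """Human elicitation step: a reviewer confirms an analysis for the lexicon."""
    lexicon[token] = analysis


# Hypothetical usage with placeholder analyses.
lexicon = {"umthetho": "PLACEHOLDER-ANALYSIS-1"}
annotations, queue = annotate("Umthetho abantu", lexicon)
accept("abantu", "PLACEHOLDER-ANALYSIS-2", lexicon)
```

In this sketch the review queue plays the role of the human elicitation function: only entries a reviewer accepts reach the lexicon, so a later pass over the same text is handled by the analyser lookup alone.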

Journal: South African Journal of African Languages
Publication Date: Jan 1, 2011
Citations: 3
