Abstract

So far, there have been few descriptions of how to create structures capable of storing lexicographic data, ISO 24613:2008 being one of the latest. Another is by Spohr (2012), who designs a multifunctional lexical resource that can store the data of different types of dictionaries in a user-oriented way. Technically, his design is based on a hierarchical XML/OWL (eXtensible Markup Language/Web Ontology Language) representation model. This article follows another route and describes a model based on entities and the relations between them; MySQL, a relational database management system built on SQL (Structured Query Language), describes a database as a system of tables containing data together with definitions of the relations between them. The model was developed in the context of the project Scientific eLexicography for Africa, and the lexicographic database to be built from it will be implemented with MySQL. The principles of the ISO model and of Spohr's model are adhered to, with one major difference in the implementation strategy: we place not the lemma but the sense description at the centre of attention, and all other elements, including the lemma, depend on the sense description. This article also describes the lexicographic data sets contained in the database and how they have been collected from different sources. As our aim is to compile several prototypical internet dictionaries (a monolingual Northern Sotho dictionary, a bilingual learners' Xhosa–English dictionary and a bilingual Zulu–English dictionary), we describe the necessary microstructural elements for each of them and the principles we adhere to when designing different ways of accessing them. We plan to make the model and the (empty) database, with all graphical user interfaces that have been developed, freely available by mid-2015.

Highlights

  • This article is concerned with the design of a lexicographic model, that is, a model of a data structure capable of storing lexicographic data, which will subsequently be used to compile several types of prototypical dictionaries for a selection of African languages

  • We agree with Spohr (2012: 23), who states that although the "Lexical Markup Framework" (LMF) describes itself as interoperable, "it remains rather vague on its application in the various contexts, and in particular of its application in human usage situations"

  • We describe a lexicographic model in this article which should fulfil various requirements: (1) it should be open to a number of lexicographical functions as several different monofunctional online dictionaries will be compiled from it; (2) it should cover the specific linguistic phenomena of the languages belonging to the Bantu language family; and (3) concerning data acquisition — as we will need to populate the database with any relevant data that can be collected semi-automatically — the database should be tolerant of missing data items, even if they are considered essential for producing a dictionary


Summary

Introduction

This article is concerned with the design of a lexicographic model, that is, a model of a data structure capable of storing lexicographic data, which will subsequently be used to compile several types of prototypical dictionaries for a selection of African languages. Besides the fact that the SeLA (Scientific eLexicography for Africa) team lacks the capacity to develop a full-scale Dictionary Writing System (DWS) or to make use of one to compile a full-scale dictionary, we consider a populated MySQL database implementation to be equivalent to a standoff XML system. In both systems, all necessary data items can be described and a number of types of relations between those data items can be modelled. We describe a lexicographic model in this article which should fulfil various requirements: (1) it should be open to a number of lexicographical functions, as several different monofunctional online dictionaries will be compiled from it; (2) it should cover the specific linguistic phenomena of the languages belonging to the Bantu language family; and (3) concerning data acquisition, since we will need to populate the database with any relevant data that can be collected semi-automatically, the database should be tolerant of missing data items, even if they are considered essential for producing a dictionary. We will describe our current approach towards data acquisition and data accessibility.
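To make the sense-centred idea and the tolerance for missing data more concrete, the following is a minimal sketch and not the project's actual schema. All table names, column names and placeholder values are assumptions made for illustration, and SQLite (through Python's standard `sqlite3` module) stands in for MySQL so that the example runs without a database server. It also simplifies by attaching each lemma row to exactly one sense.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not taken from the article.
# The sense is the central entity; lemmas and equivalents must point to a sense.
SCHEMA = """
CREATE TABLE sense (
    sense_id    INTEGER PRIMARY KEY,
    language    TEXT NOT NULL,
    description TEXT              -- may stay NULL until a definition is collected
);
CREATE TABLE lemma (
    lemma_id       INTEGER PRIMARY KEY,
    sense_id       INTEGER NOT NULL REFERENCES sense(sense_id),
    form           TEXT NOT NULL,
    part_of_speech TEXT           -- optional: a missing value is tolerated
);
CREATE TABLE translation_equivalent (
    equivalent_id INTEGER PRIMARY KEY,
    sense_id      INTEGER NOT NULL REFERENCES sense(sense_id),
    language      TEXT NOT NULL,
    form          TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with foreign-key checks switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_sense(conn, language, description=None):
    """Insert a sense (description optional) and return its id."""
    cur = conn.execute(
        "INSERT INTO sense (language, description) VALUES (?, ?)",
        (language, description),
    )
    return cur.lastrowid


def add_lemma(conn, sense_id, form, part_of_speech=None):
    """Insert a lemma that depends on an existing sense."""
    conn.execute(
        "INSERT INTO lemma (sense_id, form, part_of_speech) VALUES (?, ?, ?)",
        (sense_id, form, part_of_speech),
    )


def entries(conn):
    """Return (form, part_of_speech, description) rows, starting from the sense."""
    return conn.execute(
        """
        SELECT l.form, l.part_of_speech, s.description
        FROM sense AS s
        JOIN lemma AS l ON l.sense_id = s.sense_id
        ORDER BY l.form
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    sid = add_sense(conn, "nso")               # no description available yet
    add_lemma(conn, sid, "placeholder-lemma")  # placeholder form, no part of speech
    print(entries(conn))
```

In this sketch a lemma cannot be stored without a sense, because the foreign key rejects it, while a sense can be stored before its description or the lemma's part of speech is known. That is one possible reading of requirement (3); the actual model may distribute optional items differently.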

Aims
Aspects regarding the macrostructure and microstructure
Relational tables
Design and implementation method
Data presentation: access structure
External links
Resources to be added to the database
Available resources for the project
Other possible resources
Adding resources to the database
An example: monolingual Northern Sotho data
Summary and future work
Endnotes
Bibliography
Supplementary