Abstract

When compiling a dictionary, a lexicographer has a series of decisions to make, from drawing up a lemma list to formatting a dictionary entry. Relying on corpus data while designing a lemma list and describing entries is standard in present-day lexicography, but some decisions, such as the choice of a lemma or the treatment of derivatives, are still often intuition-based. This article investigates whether the decisions made in Swahili dictionaries comply with users' expectations. We analyse log files from the new Swahili–Polish dictionary to investigate why lookups fail, and evaluate the choice of a lemma and the treatment of derivatives in Swahili dictionaries. Based on these data we intend to expand or modify the existing electronic dictionary to adapt it to users' knowledge of grammar and dictionary structure. During this research we identified a list of lemma lacunae that cause the majority of unsuccessful Swahili searches. The study shows that users know and understand the lemmatisation strategy of the dictionary, but it also reveals which word forms cause the most problems and how the lemma list of Swahili dictionaries could be expanded.

Highlights

  • When a new dictionary is compiled, reference to a corpus is a standard procedure

  • In an electronic dictionary it is often the lemma itself that provides access to the article, and the choice of lemma is a crucial decision the lexicographer has to make. This is especially true in the case of Bantu languages, where the lemma is not intuitive and in some cases is not identical to any word form

  • In our study of log files from the Swahili–Polish dictionary we identified some problems caused by the search method, which requires users to choose the language of their search and to lemmatise word forms


Summary

Introduction

When a new dictionary is compiled, reference to a corpus is a standard procedure (cf. De Schryver et al. 2006). In an electronic dictionary it is often the lemma itself that provides access to the article, and the choice of lemma is a crucial decision the lexicographer has to make. This is especially true in the case of Bantu languages, where the lemma is not intuitive and in some cases is not identical to any word form. All lexicographic decisions have to be taken with the user in mind, and the users' skills in particular must be taken into account (cf. Atkins and Rundell 2008; Prinsloo and De Schryver 1999). Building on this assumption, a new Swahili–Polish dictionary was created and posted online as a student resource. We intend to research whether users know how to access Swahili dictionary articles, that is, how their lemma choices compare with the dictionary's lemma list. We also intend to investigate the reasons why looking up words does not always go well.

Log files as a tool of dictionary user research
Citation-forms in Swahili dictionaries
The treatment of derivatives in Swahili dictionaries
Log files of a Swahili–Polish dictionary
The analysis of the searches
Conclusion
Dictionaries and corpora
Findings
Other literature