Semi-automating the Reading Programme for a Historical Dictionary Project

Tim Van Niekerk, Johannes Schäfer, Ulrich Heid

doi:10.5788/28-1-1468

Open Access

Journal: Lexikos
Publication Date: Dec 1, 2018
License type: cc-by

Affiliation: Rhodes University, University of Hildesheim

Abstract

This paper describes the resources and software procedures used or developed in a major enabling step towards the revision of the scholarly reference work A Dictionary of South African English on Historical Principles (DSAE, Silva et al. 1996), namely the semi-automatic generation of a digitally-sourced lexical database on which new and updated dictionary entries will be based; as well as the addition, in parallel, of a new corpus of South African English (SAE) to the project. Drawing on online data sources and an extensive list of known SAE word forms, we have developed a software toolchain to gather, encode, annotate and collate textual sources, producing: (i) a 3.1-billion part-of-speech-annotated corpus of South African English; (ii) a lexical database of illustrative quotations for over 20,000 known SAE word forms, available for selection at the entry-revision stage; and (iii) a list of potential new variant spellings and headword inclusion candidates. These steps replace, where recent electronic sources are concerned, the mechanical aspects of quotation gathering, normally undertaken manually through a reading programme requiring years of teamwork to acquire sufficient coverage (cf. Hicks 2010).

Highlights

Summary (Opsomming): The semi-automation of the reading programme of a historical dictionary project.
A Dictionary of South African English on Historical Principles (DSAE, Silva et al. 1996) is a diachronic variety dictionary, first published as a single-volume print dictionary spanning about 800 pages and available as a pilot online edition at http://dsae.co.za since 2014.
Much of the DSAE's compilation process was directed towards an ongoing reading programme.

Summary

Role of quotations in the dictionary

A Dictionary of South African English on Historical Principles (DSAE, Silva et al. 1996) is a diachronic variety dictionary, first published as a single-volume print dictionary spanning about 800 pages and available as a pilot online edition at http://dsae.co.za since 2014. With the help of numerous volunteer readers, approximately 300,000 index card citations were collected as illustrative evidence for dictionary entries, their sense-divisions as they evolve through time, and nested lemmas. Of these, about 45,000 quotations were included in the printed version of the dictionary, resulting in an average of 10 quotations per entry and producing a full running text of about 1.5 million words.

The need for new quotations

Typical quotation-gathering stages

Input data sources

Newspaper Corpus

Web Corpus

Annotated corpus and corpus query system

General overview

Input: SAE dictionary search list

Analysis of new headword candidates unrecognised by the TreeTagger

Detection of new variants based on word similarity

Detection of new headword candidates based on word similarity

Detection of headword candidates using term extraction

Re-orientation of reading programme prompted by semi-automation

Conclusion

Findings

References

Similar Papers

Where does a New English dictionary stop? On the making of the Dictionary of South African Indian English
Rajend Mesthrie
English Today | VOL. 29
27 Feb 2013

Firming up the Foundations: Reflections on Verifying the Quotations in a Historical Dictionary, with Reference to "A Dictionary of South African English on Historical Principles"
S Hicks
Lexikos | VOL. 20
13 Dec 2010

A Dictionary of South African English (review)
David L Gold
Dictionaries: Journal of the Dictionary Society of North America | VOL. 11
01 Jan 1989

Adapting a Historical Dictionary for the Modern Online User: The Case of the Dictionary of South African English on Historical Principles's Presentation and Navigation Features
André Du Plessis ... Tim Van Niekerk
Lexikos | VOL. 26
01 Nov 2016
