Abstract
Corpus-based terminology is gaining ground internationally. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development but also follow the trend, even if they do not have the same access to highly sophisticated software. The aim of this article is to establish whether definitional information on key concepts can be retrieved from untagged, running text using affordable and easily accessible software such as WordSmith Tools. To answer this question, a case study is carried out for Northern Sotho, using textual material on linguistics as the basis for a special field corpus. Syntactic and lexical patterns that serve as textual markers of definitional information are identified, and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of the semi-automatic retrieval of definitional information. Finally, it is illustrated how the definitional information retrieved can be used in writing a formal terminological definition.
Keywords: terminology, South African Bantu languages, definitional information, semi-automatic information retrieval, terminological definitions, conceptual relationships, lexical patterns, syntactic patterns, textual markers, keyword-in-context (KWIC), WordSmith Tools
Highlights
Afrikaans title (Opsomming): Semi-automatic retrieval of definitional information: a Northern Sotho case study
The process of semi-automatic retrieval of definitional information reduces the time that has to be spent on consultation with special field experts, which, in turn, might make them more willing to participate in terminology projects.
A case study was done for Northern Sotho, using a special field corpus on linguistics.
Summary
The feasibility of retrieving definitional information semi-automatically from special field corpora is investigated. For the purpose of this study, the term 'definitional information' refers to any information found in an electronic special field corpus regarding the meaning and usage of a term, as well as the conceptual relationships it has with other terms. In this regard, two issues will be addressed. In the first instance, Pearson (1998: 5) states that authors writing within certain specified communicative settings are likely to provide explanations of at least some of the terms they use. This hypothesis is tested with regard to Northern Sotho, using a special purpose corpus consisting of a collection of texts on linguistics as authentic data. Before these issues are addressed, the current methodological options open to South African Bantu language terminologists for generating definitional information are investigated.
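The keyword-in-context (KWIC) idea behind this kind of semi-automatic retrieval can be illustrated with a minimal Python sketch. It is not the WordSmith Tools procedure used in the study, and the marker phrases below are hypothetical English examples chosen for illustration only; they are not the Northern Sotho textual markers identified in the article. The function name `kwic`, the `width` parameter, and the sample text are likewise assumptions.

```python
import re


def kwic(text, markers, width=40):
    """Return (left context, marker, right context) for each marker hit."""
    # Collapse whitespace so context windows are not broken by line wraps.
    flat = " ".join(text.split())
    # Try longer markers first so overlapping alternatives prefer the longer one.
    ordered = sorted(markers, key=len, reverse=True)
    alternatives = "|".join(re.escape(m) for m in ordered)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

    hits = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits


# Hypothetical English definitional markers (illustration only).
MARKERS = ["is defined as", "refers to", "is called"]

# Hypothetical sample text standing in for an untagged special field corpus.
sample = (
    "A morpheme is defined as the smallest meaningful unit of a language. "
    "The term phoneme refers to a contrastive sound unit. "
    "Speakers use many words every day."
)

hits = kwic(sample, MARKERS, width=30)
for left, node, right in hits:
    # Right-align the left context so the markers line up in a column.
    print(f"{left:>30} [{node}] {right}")
```

A terminologist would still have to read each concordance line and decide whether it really carries definitional information, which is why the retrieval is described as semi-automatic rather than automatic.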