Abstract

Corpus-based terminology is currently gaining ground internationally. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development but also follow the trend, even if they do not have the same measure of access to highly sophisticated software. The aim of this article is to establish whether it is possible to retrieve definitional information on key concepts from untagged, running text by making use of affordable and easily accessible software such as WordSmith Tools. To answer this question, a case study is done in Northern Sotho, using textual material on linguistics as the basis for a special field corpus. Syntactic and lexical patterns serving as textual markers of definitional information are identified, and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of the semi-automatic retrieval of definitional information. Finally, it is illustrated how the definitional information retrieved can be utilised in the writing of a formal terminological definition.

Keywords: terminology, South African Bantu languages, definitional information, semi-automatic information retrieval, terminological definitions, conceptual relationships, lexical patterns, syntactic patterns, textual markers, keyword-in-context (KWIC), WordSmith Tools

Highlights

  • Summary (Opsomming): Semi-automatic retrieval of definitional information: a Northern Sotho case study

  • Semi-automatic retrieval of definitional information reduces the time that has to be spent consulting special field experts, which in turn might make them more willing to participate in terminology projects

  • A case study was done for Northern Sotho, using a special field corpus on linguistics

Summary

Rationale

This study investigates the feasibility of retrieving definitional information semi-automatically from special field corpora. For the purpose of this study, the term 'definitional information' refers to any information found in an electronic special field corpus regarding the meaning and usage of a term, as well as the conceptual relationships it has with other terms. In this regard, two issues will be addressed. In the first instance, Pearson (1998: 5) states that authors writing within certain specified communicative settings are likely to provide explanations of at least some of the terms they use. This hypothesis is tested with regard to Northern Sotho, using a special purpose corpus consisting of a collection of texts on linguistics as authentic data. Before these issues are addressed, the methodological options currently open to South African Bantu language terminologists for generating definitional information are investigated.

Generating definitional information: the current South African scenario
Compilation of an electronic special field corpus
Identification of 50 single-word test terms
Isolating concordance lines for the 50 test terms
Identification of textual markers of definitional information
Lexical and syntactic markers of definitional information in Northern Sotho
Analysis of results
Retrieval of information on conceptual relationships
Findings
Conclusion
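
The workflow named in the headings above (isolate concordance lines for each test term, then look for textual markers of definitional information) can be illustrated with a minimal Python sketch. It is not the WordSmith Tools procedure used in the study; it is a stand-in that mimics a keyword-in-context (KWIC) search on untagged text. The function names `kwic` and `has_marker`, the context width, and the English marker phrases are all hypothetical and chosen only for illustration; the Northern Sotho markers identified in the article would have to be substituted.

```python
import re


def kwic(text, term, width=40):
    """Return (left, keyword, right) tuples for each whole-word match of term.

    `width` is the number of context characters kept on each side
    (an assumed value, not one taken from the study).
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


def has_marker(line, markers):
    """Check whether a concordance line contains any textual marker."""
    context = "".join(line).lower()
    return any(
        re.search(r"\b" + re.escape(marker.lower()) + r"\b", context)
        for marker in markers
    )


if __name__ == "__main__":
    # Hypothetical English sample text and markers, for illustration only.
    sample = (
        "A morpheme is defined as the smallest meaningful unit. "
        "The morpheme was discussed earlier."
    )
    markers = ["is defined as", "is called"]
    for left, keyword, right in kwic(sample, "morpheme"):
        flag = "*" if has_marker((left, keyword, right), markers) else " "
        print(f"{flag} {left:>40} [{keyword}] {right}")
```

Lines flagged with `*` are candidates for definitional information and would still need to be checked by a terminologist, which is why the process is semi-automatic rather than fully automatic.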
