The syntax of languages is understood to be shaped by universal syntactic principles. Despite operating within the constraints of these universals, languages still display unique syntactic features. This is explained by language-specific ranking of the universals. To establish these unique features, researchers use data collected through a variety of methods, including corpus studies, linguistic elicitation, introspection and experimentation. Each of these methods requires a research tool whose development or adoption depends on the research question(s) formulated to fill a gap in existing studies. Using a corpus construct to generate data, like using any other research tool, requires an understanding of what type of data is needed to answer which questions about which linguistic features. A corpus can be constructed from plain texts or annotated texts. The question is: how can a corpus construct be used as a tool in the study of languages whose corpora are yet to be compiled and made available online? This article seeks to answer this question with a focus on Bantu languages. It will be necessary to build corpora for the languages in question so that databases from the corpus constructs become available. The findings are expected to offer knowledge on how to build a corpus and how to use the built corpus to investigate a syntactic feature in a Bantu language.