Linguistic corpora of understudied languages: Do they make sense?

Igor Vinogradov

doi:10.15517/rk.v40i1.24143

Abstract

A corpus of an understudied language is usually documentary-linguistic in nature and comprises all the text material available in that language. However, without text selection, it is impossible to obtain a representative and balanced sample of language use. The lack of these two characteristics makes such a corpus almost useless for any kind of quantitative research. Nevertheless, corpora of understudied languages serve a wide range of language documentation objectives. Furthermore, they can serve as evidence that word forms or grammatical features meeting specific search criteria are attested in texts. When such corpora are richly annotated, they can complement grammatical descriptions and dictionaries, and their digital format sets them apart from ordinary text collections. They are especially suitable for typological research, which has to deal with large amounts of data from different, unrelated languages.

Journal: Káñina
Publication Date: May 3, 2016
Citations: 3
License type: CC BY-NC-ND 4.0
