Abstract

The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct validity. The proposed solution draws on Zipf’s (1935) emic perspective of language, which views LD as a matter of perception, but which also assumes that competent speakers of a common language share similar perceptions. The present study tests whether this is true, and specifically whether untrained human raters will show high levels of inter-rater reliability in their judgments of the levels of LD found in 60 texts extracted from a corpus of narratives written in English by a mix of language learners and native speakers. The results confirm Zipf’s assertion, but also indicate that a relatively large number of motivated raters is needed to demonstrate this tendency. The remainder of the study discusses the implications of these results for the development of an automated measure of LD to be used with learner corpora. The proposed method begins with human judgments of a representative subsample of a corpus, proceeds to a statistical model of objective measures that accurately predicts those human judgments, and ends with a multidimensional, corpus-specific automated measure whose output reliably estimates how a group of human judges would rate the levels of LD in the texts of that corpus.
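The calibration pipeline described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: it uses a single objective predictor (type-token ratio, itself one of the repetition-based measures the abstract critiques) regressed onto hypothetical mean ratings from a panel of human judges, whereas the proposed measure is multidimensional and corpus-specific.

```python
# Sketch of the proposed calibration idea (hypothetical data and predictor):
# 1) obtain human LD ratings for a subsample of texts,
# 2) fit a statistical model of objective measures to those ratings,
# 3) use the model to estimate ratings for the rest of the corpus.

def type_token_ratio(text):
    """Ratio of unique words (types) to total words (tokens)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical calibration subsample: texts paired with the mean LD
# rating assigned by a group of human judges (e.g. on a 1-9 scale).
texts = [
    "the cat sat on the mat the cat sat",
    "a quick brown fox jumps over a lazy sleeping dog nearby",
]
ratings = [3.0, 7.5]

xs = [type_token_ratio(t) for t in texts]
a, b = fit_simple_regression(xs, ratings)

# Estimate the rating the panel would give an unseen text.
predicted = a + b * type_token_ratio("every word here is completely different")
```

A full implementation along these lines would replace the single predictor with a set of objective measures and a multivariate model, selected so that its predictions reproduce the reliable human judgments for that specific corpus.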
