Abstract

Most measures of lexical richness in spontaneous speech data that are based on the distribution of types and tokens, or on the relation between them, appear to be neither reliable nor valid. This article describes a semi-automatic computer program, MLR (Measure of Lexical Richness), that measures lexical richness on the basis of the degree of difficulty of the words used, as measured by their (levels of) frequency in daily language input. The MLR is meant for the analysis of texts of (students in) primary education, with a vocabulary size of up to about 25,000 different lemmas, and answers the following questions: 1) What is the difficulty of the various words in the text? 2) What is the relative proportion of each degree of difficulty among the words in the text? 3) What percentage of the text is covered for a student with a given vocabulary size? 4) What is the student's vocabulary size, as estimated from the spontaneous speech data?
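
As a rough illustration of questions 1 to 3, the sketch below assigns each lemma a difficulty band from its frequency rank, then computes the share of each band and the coverage percentage for a given vocabulary size. It is not the MLR program: the rank table `RANK`, the band limits in `BANDS`, the function names, and the assumption that a student knows exactly the most frequent `vocab_size` lemmas are all hypothetical and chosen only for the example. Question 4 (estimating vocabulary size) is not attempted here.

```python
from collections import Counter

# Hypothetical lemma -> frequency rank table (1 = most frequent in daily input).
RANK = {"the": 1, "dog": 2, "run": 3, "fence": 4, "gallop": 5, "paddock": 6}

# Hypothetical difficulty bands as (label, highest rank included in the band).
BANDS = [("easy", 2), ("medium", 4), ("hard", 6)]


def band_of(lemma):
    """Return the difficulty band of a lemma, or 'unknown' if it has no rank."""
    rank = RANK.get(lemma)
    if rank is None:
        return "unknown"
    for label, upper in BANDS:
        if rank <= upper:
            return label
    return "unknown"


def band_proportions(lemmas):
    """Share of tokens falling in each difficulty band."""
    if not lemmas:
        return {}
    counts = Counter(band_of(lemma) for lemma in lemmas)
    return {label: n / len(lemmas) for label, n in counts.items()}


def coverage(lemmas, vocab_size):
    """Percentage of tokens whose rank is within the assumed vocabulary size.

    Simplifying assumption: the student knows the `vocab_size` most frequent
    lemmas and nothing else.
    """
    if not lemmas:
        return 0.0
    known = sum(1 for lemma in lemmas
                if RANK.get(lemma, float("inf")) <= vocab_size)
    return 100.0 * known / len(lemmas)


# A toy lemmatized text of eight tokens.
text = ["the", "dog", "run", "the", "fence", "gallop", "paddock", "the"]
print(band_proportions(text))
print(coverage(text, vocab_size=4))
```

In a realistic setting the rank table would come from a frequency list of daily language input, and the text would first be lemmatized, which is presumably where the semi-automatic step comes in.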
