Abstract

Lexical complexity, generally understood as a multidimensional construct consisting of lexical density, sophistication, and diversity, has been recognized as an important construct in both first and second language acquisition. Researchers have proposed a large variety of lexical complexity measures to study its relationship to second language learners’ writing and/or speaking proficiency. While this line of research has generated fruitful results for English and other Indo-European languages, less is known about how it applies to a typologically distant language such as Chinese. In this paper, we report the design of a computational tool that automates the measurement of lexical complexity with 25 indices. The Batch Mode of the software supports analyses of a large number of .txt files and outputs results to a .csv file suitable for importation into statistical packages for further analyses. In an example application, we analyzed 87 texts from three distinct registers (academic prose, fiction, and press reportage) from the Lancaster Corpus of Mandarin Chinese (McEnery and Xiao, The Lancaster Corpus of Mandarin Chinese: A corpus for monolingual and contrastive language study, in Proc. Fourth Int. Conf. Language Resources and Evaluation, eds. M. Lino, M. Xavier, F. Ferreira, R. Costa and R. Silva (European Language Resources Association, Paris, 2014), pp. 1175–1178) and explored how register variation may manifest in lexical complexity. The ANOVA results showed a linear increase in multiple lexical sophistication and diversity measures across academic prose, fiction, and press reportage. Press reportage has lower lexical density but higher lexical sophistication and diversity than academic prose; fiction also has higher lexical diversity than academic prose. The paper concludes with a discussion of potential pedagogical applications for Chinese language teaching and learning.
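As a rough illustration of the batch workflow described above, the following Python sketch computes two common lexical diversity indices for every .txt file in a folder and writes one row per file to a .csv file. It is not the tool reported in this paper: the function names (`diversity_indices`, `batch_to_csv`) are hypothetical, the two indices shown (type-token ratio and root type-token ratio) are only examples and are not claimed to be among the 25 indices, and the texts are assumed to be already word-segmented with tokens separated by whitespace.

```python
import csv
import math
from pathlib import Path


def diversity_indices(tokens):
    """Return simple lexical diversity indices for a list of word tokens."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    return {
        "tokens": n_tokens,
        "types": n_types,
        # Type-token ratio: distinct words divided by total words.
        "ttr": n_types / n_tokens if n_tokens else 0.0,
        # Root TTR (Guiraud's index): reduces the text-length sensitivity of TTR.
        "root_ttr": n_types / math.sqrt(n_tokens) if n_tokens else 0.0,
    }


def batch_to_csv(input_dir, output_csv):
    """Analyze every .txt file in input_dir and write one CSV row per file.

    Assumption: each file is pre-segmented, with tokens separated by whitespace.
    """
    fields = ["file", "tokens", "types", "ttr", "root_ttr"]
    with open(output_csv, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fields)
        writer.writeheader()
        for path in sorted(Path(input_dir).glob("*.txt")):
            tokens = path.read_text(encoding="utf-8").split()
            writer.writerow({"file": path.name, **diversity_indices(tokens)})
```

Calling `batch_to_csv("texts", "results.csv")` on a folder of segmented texts would produce a table that can be loaded into a statistical package, for example to compare registers with an ANOVA. Measures of lexical density and sophistication would additionally require part-of-speech tags and a word-frequency list, which this sketch leaves out.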
