Abstract

Research has demonstrated that features of lexical and lexicogrammatical use are important predictors of productive second language (L2) proficiency (e.g. Kyle et al. 2018). While some features of lexical use have been studied in L2s other than English (e.g. Tracy-Ventura 2017), multivariate lexical and lexicogrammatical approaches in these L2s are rare. In this study, we extend the use of multivariate approaches to L2 Spanish writing. Our learner data included a subset of the CEDEL2 corpus (Lozano 2021), comprising proficiency scores and 644 descriptive essays written in L2 Spanish by L1 English writers. Correlational analyses were conducted between proficiency scores and indices of lexical diversity (e.g. MTLD), mean word and bigram frequencies, and bigram strength of association (MI, delta). A final regression analysis accounted for 48.3 per cent of the variance in proficiency scores. In line with previous L2 English writing research (e.g. Kyle et al. 2018; Monteiro et al. 2020), more proficient L2 Spanish writers tended to use a wider variety of lexical items, more strongly associated word combinations, and lexical items that are less frequent in corpora.
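
To make the bigram association indices more concrete, the following is a minimal Python sketch of how such scores can be computed. It rests on several assumptions that the abstract does not confirm: that "MI" is pointwise mutual information in bits, that "delta" refers to the directional Delta P measure (the probability of the second word given the first, minus its probability given any other first word), and that counts come from a single token list rather than from a separate reference corpus. The function name `bigram_association` and the sample sentence are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def bigram_association(tokens):
    """Return {(w1, w2): (mi, delta_p)} for every adjacent word pair.

    Illustrative assumption: counts are taken from `tokens` itself;
    a real analysis would more likely use a large reference corpus.
    """
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)                          # total number of bigram slots
    joint = Counter(pairs)                  # count(w1, w2)
    left = Counter(w1 for w1, _ in pairs)   # count(w1 in first position)
    right = Counter(w2 for _, w2 in pairs)  # count(w2 in second position)

    scores = {}
    for (w1, w2), a in joint.items():
        # Pointwise mutual information: log2( P(w1,w2) / (P(w1) * P(w2)) )
        mi = math.log2(a * n / (left[w1] * right[w2]))

        # Delta P (w2 given w1): P(w2 | w1) - P(w2 | not w1)
        other_slots = n - left[w1]          # bigrams that do not start with w1
        p_given_other = (right[w2] - a) / other_slots if other_slots else 0.0
        delta_p = a / left[w1] - p_given_other

        scores[(w1, w2)] = (mi, delta_p)
    return scores


if __name__ == "__main__":
    sample = "el gato negro y el gato blanco".split()
    for bigram, (mi, dp) in sorted(bigram_association(sample).items()):
        print(bigram, round(mi, 3), round(dp, 3))
```

Under these assumptions, a text-level index could then be obtained by averaging the scores over all bigrams in an essay, which mirrors the idea that more proficient writers use more strongly associated word combinations.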