Abstract

In the last few decades, much work in corpus linguistics has attempted to discover, and then interpret, differences in the frequencies of use of linguistic elements (words, patterns, constructions, discourse features, etc.). It is probably fair to say that such studies have been particularly frequent in (i) learner corpus research, (ii) corpus-based varieties research, and (iii) sociolinguistically motivated studies. For instance, many studies have discussed differences in how often certain elements are used (i) in corpus data from native speakers vs. corpus data from learners with different L1 backgrounds, (ii) in corpora representing different inner- and outer-circle varieties, or (iii) by speakers in corpora representing people of different gender or sexual identities. This paper makes the admittedly bold claim that any such study is by definition unable to ‘prove’ what is often its main point, namely that the distributional differences found are in fact due to the hypothesized explanatory variable(s) of L1, VARIETY, or, e.g., GENDER, even when the distributional differences are significant and come with a decent effect size. To substantiate this claim, I will first discuss some terminology from the family of methods known as multi-level modeling, namely the distinction between level-1, level-2, ... level-n variables and its relevance for many corpus studies. Second, I will demonstrate how studies using only the above kinds of variables cannot distinguish the effect of their favored predictors from the effect of local/contextual level-1 variables. Third, in discussing this, I will exemplify how such effects need to be explored quantitatively instead.
