Abstract

Automated tools are widely used to assess syntactic complexity in second language (L2) writing studies; however, the effects of text length on syntactic complexity indices remain unclear. This can pose a challenge when studying underrepresented populations (e.g., young learners, adults with limited literacy skills), as their lower proficiency may lead them to produce shorter texts. To address this issue, we investigated the minimum text length at which automated measures of syntactic complexity become reliable, taking L2 proficiency and prompt topic into account. Essays from the ICNALE corpus, a dataset of 5,200 essays spanning four proficiency levels, were used to create a dataset of texts of varying lengths (50, 100, 150, and 200 words). Mixed-effects regression models showed that seven of the 14 indices were not affected by text length, regardless of learner proficiency and prompt topic. The other seven differed only between the 50- and 200-word texts within intermediate levels. We suggest a minimum of 100 words as a conservative threshold for the reliability of syntactic complexity indices. Finally, we emphasize the importance of transparently reporting text length information.
