Abstract
Automated tools are widely used to assess syntactic complexity in second language (L2) writing studies; however, the effects of text length on syntactic complexity indices remain unclear. This uncertainty can pose a challenge when studying underrepresented populations (e.g., young learners, adults with limited literacy skills), whose lower proficiency may lead them to produce shorter texts. To address this issue, we investigated the minimum text length at which automated measures of syntactic complexity become reliable, while accounting for L2 proficiency and prompt topic. Essays from the ICNALE corpus, which contains 5,200 essays across four proficiency levels, were used to create a dataset of texts of varying lengths (50, 100, 150, and 200 words). Mixed-effects regression models showed that seven of the 14 indices were not affected by text length, regardless of learner proficiency and prompt topic. The other seven differed only between the 50- and 200-word texts at intermediate proficiency levels. We suggest a minimum of 100 words as a conservative threshold for the reliability of syntactic complexity indices. Finally, we emphasize the importance of transparently reporting text length information.