This corpus-based study explored the development of cohesive devices in the writing of beginner-level Chinese learners of English as a foreign language (EFL) over a three-year span. A longitudinal learner corpus of more than 500 exam essays was analyzed quantitatively using the Tool for Automatic Analysis of Cohesion (TAACO). Lexical, syntactic, semantic, and discourse features were examined to identify reliable indices for tracking learners’ progressive mastery of cohesion. Results revealed that pronoun-related features, including pronoun density and repetition, differed significantly between year pairs and robustly predicted writing development. By contrast, most lexical and connective indices showed ambiguous trajectories over time. The findings highlight the vital role of pronouns in building coherence for novice writers and underscore persistent difficulties in acquiring sophisticated content words and their collocations. This study contributes data-driven insights into the nonlinear processes and enduring challenges that shape EFL beginners’ cohesive competence, and it demonstrates the value of computational tools and learner corpora for exploring discourse acquisition.