Abstract

This paper introduces the open-source English Language Learning Insight, Proficiency and Skills Evaluation (ELLIPSE) corpus. The corpus comprises approximately 6,500 essays written by English language learners (ELLs) during state-wide standardized annual testing in the United States. The essays respond to 29 different independent prompts that required no background knowledge on the part of the writer. Individual difference information is available for each essay, including economic status, gender, grade level (8–12), and race/ethnicity. Each essay was scored for English language proficiency by two trained human raters, who assigned an overall proficiency score and analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. The paper reports the reliability of the human proficiency judgments included in the corpus. The ELLIPSE corpus addresses many of the concerns raised about existing learner corpora, in particular by providing unique holistic and analytic scores for each ELL essay. The corpus also includes limited demographic and individual difference data for each ELL.
