The University of Pittsburgh English Language Institute Corpus (PELIC)

Ben Naismith, Alan Juffs, Na-Rae Han

doi:10.1075/ijlcr.21002.nai

Abstract

This report introduces the University of Pittsburgh English Language Institute Corpus (PELIC; Juffs et al., 2020), a publicly available 4.2-million-word learner corpus of written texts. Collected over seven years in the University of Pittsburgh’s Intensive English Program, these texts were produced by more than 1,100 students with diverse linguistic backgrounds and proficiency levels. Unlike most learner corpora, which are cross-sectional, PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting. This potential is illustrated in an overview of the research conducted to date with these data. The report also describes PELIC’s creation and contents, including how the texts have been managed to facilitate natural language processing. Overall, the corpus contributes to the field of learner corpus research by adding to the pool of freely and publicly available learner corpora, and it is supplemented by a useful set of Python tools and tutorials for accessing these data.

Journal: International Journal of Learner Corpus Research
Publication Date: Mar 8, 2022
Citations: 4
