Abstract
This paper describes the collection of a semi-spontaneous spoken corpus produced by learners of Spanish with more than nine different mother tongues. The corpus was tagged to conduct a computer-aided error analysis. In addition, the corpus was used to develop a computer-based tool that has practical pedagogical applications (e.g., to train teachers of Spanish). The interface is available online to allow teachers and linguists to consult the data. This paper explains the methodology I followed to gather the data. First, I consider the data collection method and the corpus design. Second, I describe the transcription conventions and the XML tags used to code learners' metadata and their errors. Third, I explain the criteria used to mark oral production errors and the error typology. I then consider the design, development and evaluation of the corpus search tool. Lastly, I put forward some pedagogical applications. The conclusions and limitations of the project are outlined in the final section.