Abstract

The present paper, a continuation of Tammekänd and Torn-Leesik’s (2022) study, examines how learner errors affect the CLAWS7 tagger’s automated assignment of part-of-speech (POS) tags in a 24,812-word sample of the Tartu Corpus of Estonian Learner English (TCELE). Learner errors that caused tagging errors in the sample were identified and used as the basis for a working error taxonomy. The POS-tagged and error-tagged samples were collated and compared to map correlations between learner errors and tagging errors. Error groups that correlated with significantly increased rates of tagging errors were identified, and possible reasons for the impact of learner errors on the tagger’s performance were suggested. The CLAWS7 tagger misanalysed only 2.8% of forms representing learners’ language errors but assigned wrong tags to roughly every fifth spelling error (22%).
