Abstract

Vocabulary learning is part of nearly every second language learning curriculum, which therefore needs methods and materials for training and testing learners' vocabulary knowledge. In large-scale programs, preparing such materials can be labor-intensive, so automatic means of generation are desirable. VocaTT (Vocabulary Training and Testing) is an ongoing project to use machine learning methods to generate novel multiple-choice cloze (i.e., fill-in-the-blank) items for use in second language learning programs. This paper describes the ongoing creation of a gold-standard set of multiple-choice cloze items to be used in training a machine learning algorithm. Machine-generated multiple-choice cloze items were reviewed by two experienced language teachers, who evaluated each item for well-formedness (i.e., suitability as a multiple-choice cloze test item) and chose one of three options: reject as unsalvageable, keep as-is, or revise into a well-formed item as they thought best. Results for a 600-item set that both checkers evaluated show moderate agreement on the question of rejection but only slight agreement on keeping as-is. For revised items, agreement on what type of revisions to make was slight to fair. In an expanded set of 2,792 items, checkers judged most items as needing revision but made varying kinds of revisions to yield well-formed items. Interested researchers may contact the authors to inquire about access to and use of the evaluation dataset.
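
The abstract does not name the agreement statistic, but the labels "slight", "fair", and "moderate" match the Landis and Koch bands commonly applied to Cohen's kappa. The following is a minimal sketch under that assumption, showing how agreement between two checkers on a single decision (for example, reject versus not reject) could be computed. The function names, label strings, and sample judgments are hypothetical and are not taken from the VocaTT dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch_band(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive band."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def binarize(labels, target):
    """Collapse three-way judgments into 'target' versus everything else."""
    return [label == target for label in labels]


if __name__ == "__main__":
    # Hypothetical judgments, for illustration only.
    checker_a = ["reject", "keep", "revise", "revise", "reject", "revise"]
    checker_b = ["reject", "revise", "revise", "revise", "revise", "keep"]
    for decision in ("reject", "keep"):
        kappa = cohen_kappa(binarize(checker_a, decision),
                            binarize(checker_b, decision))
        print(decision, round(kappa, 2), landis_koch_band(kappa))
```

Binarizing each decision separately mirrors the way the abstract reports agreement on rejection and on keeping as-is as two distinct figures rather than as one three-way score.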
