Generation of Multiple-Choice Questions From Textbook Contents of School-Level Subjects

Dhawaleswar Rao Ch, Sujan Kumar Saha

doi:10.1109/tlt.2022.3224232

Abstract

Multiple-choice questions (MCQs) play a significant role in educational assessment. Automatic MCQ generation has been an active research area for years, and many systems have been developed for it. Still, we could not find any system that generates accurate MCQs from school-level textbook content that are useful in real examinations. This observation motivated us to develop a system that generates MCQs to assess students' recall of factual information. In addition, the available systems are often domain-, subject-, or application-specific. Although MCQ generation demands a task-specific setup, we expect that a degree of generalization can be achieved, and we address this issue as well. We propose a pipeline for the automatic generation of MCQs from textbooks of middle-school-level subjects; the pipeline is partially subject-independent. The proposed pipeline comprises four core modules: preprocessing, sentence selection, key selection, and distractor generation. Several techniques are employed to implement the individual modules, including sentence simplification, syntactic and semantic processing of sentences, entity recognition, extraction of semantic relationships among entities, WordNet, neural word embeddings, neural sentence embeddings, and computation of intersentence similarity. The system is evaluated using textbooks of the National Council of Educational Research and Training (NCERT), India, for three subjects. The quality of the system-generated questions is assessed by human experts using various system-level and module-level metrics. The experimental results demonstrate that the proposed system is capable of generating quality questions that could be useful in a real examination.
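
To make the four-module structure concrete, the following is a minimal illustrative sketch of how such stages might be chained. It is not the authors' implementation: every function name is hypothetical, and each stage uses a toy heuristic (regex sentence splitting, capitalized words as stand-ins for recognized entities, other candidate keys as distractors) in place of the sentence simplification, entity recognition, WordNet, and embedding-based techniques named in the abstract.

```python
import re
from dataclasses import dataclass


@dataclass
class MCQ:
    stem: str          # sentence with the key blanked out
    key: str           # correct answer
    distractors: list  # incorrect options


def preprocess(text):
    """Stage 1 (toy): split raw text into sentences on ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def candidate_keys(sentence):
    """Toy stand-in for entity recognition (an assumption):
    capitalized words other than the first word of the sentence."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [t for t in tokens[1:] if t[0].isupper()]


def select_sentences(sentences, min_words=5):
    """Stage 2 (toy): keep sentences that are long enough and
    contain at least one candidate key."""
    return [s for s in sentences
            if len(s.split()) >= min_words and candidate_keys(s)]


def select_key(sentence):
    """Stage 3 (toy): use the last candidate in the sentence as the key."""
    return candidate_keys(sentence)[-1]


def generate_distractors(key, pool, k=3):
    """Stage 4 (toy): take up to k other candidates from the pool.
    A real system would rank candidates by semantic similarity."""
    unique = []
    for candidate in pool:
        if candidate != key and candidate not in unique:
            unique.append(candidate)
    return unique[:k]


def generate_mcqs(text):
    """Chain the four stages into one pipeline."""
    sentences = select_sentences(preprocess(text))
    pool = [c for s in sentences for c in candidate_keys(s)]
    mcqs = []
    for s in sentences:
        key = select_key(s)
        stem = s.replace(key, "_____", 1)  # blank the first occurrence
        mcqs.append(MCQ(stem, key, generate_distractors(key, pool)))
    return mcqs


if __name__ == "__main__":
    sample = ("The capital of India is Delhi. "
              "The river Ganga flows through Varanasi. It rains.")
    for q in generate_mcqs(sample):
        print(q.stem, "| key:", q.key, "| distractors:", q.distractors)
```

Each stage is a separate function, so any of the toy heuristics can be swapped for a stronger technique without changing the rest of the chain.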

Full Text