Automatic generation of physics items with Large Language Models (LLMs)
High-quality items are essential for producing reliable and valid assessments, offering valuable insights for decision-making. As demand grows for items with strong psychometric properties in both summative and formative assessments, automatic item generation (AIG) has gained prominence. Research highlights the potential of large language models (LLMs) in the AIG process, noting the positive impact of generative AI tools such as ChatGPT on educational assessment and their ability to generate various item types across different languages and subjects. This study addresses a research gap by exploring how well AI-generated items in secondary/high school physics align with an educational taxonomy. It draws on Bloom's taxonomy, a well-known framework for designing and categorizing assessment items across cognitive levels from low to high, and focuses on a preliminary assessment of LLMs' ability to generate physics items that match the Application level of Bloom's taxonomy. Two leading LLMs, ChatGPT (GPT-4) and Gemini, were chosen for their strong performance in creating high-quality educational content. The research used various prompts to generate items at different cognitive levels based on Bloom's taxonomy. These items were assessed against multiple criteria: clarity, accuracy, absence of misleading content, appropriate complexity, correct language use, alignment with the intended level of Bloom's taxonomy, solvability, and assurance of a single correct answer. The findings indicated that both ChatGPT and Gemini were skilled at generating physics assessment items, though their effectiveness varied with the prompting method used. Instructional prompts in particular produced excellent outputs from both models, yielding items that were clear, precise, and consistently aligned with the Application level of Bloom's taxonomy.
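To make the prompting approach concrete, the sketch below shows how an instructional prompt for an Application-level physics item might be sent to a model, assuming the OpenAI Python client; the prompt wording, model name, and temperature are illustrative choices, not the study's exact materials.

```python
# Minimal sketch: generating an Application-level physics item with an
# instructional prompt. The prompt wording and model choice are illustrative
# assumptions, not the exact prompts used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

INSTRUCTIONAL_PROMPT = (
    "You are a high school physics teacher. Write one multiple-choice item "
    "at the 'Apply' level of Bloom's taxonomy on Newton's second law. "
    "Requirements: a clear stem, four options, exactly one correct answer, "
    "no misleading content, and language suitable for secondary students. "
    "Mark the correct option."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": INSTRUCTIONAL_PROMPT}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```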
- Research Article
- 10.21449/ijate.1602294
- Jun 1, 2025
- International Journal of Assessment Tools in Education
This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly used LLMs in current AIG literature, their specific applications in the AIG process, and the characteristics of the generated items. We found that LLMs are flexible and effective in generating various types of items across different languages and subject domains. However, many studies have overlooked the quality of the generated items, indicating a lack of a solid educational foundation. Therefore, we share two suggestions to enhance the educational foundation for leveraging LLMs in AIG, advocating for interdisciplinary collaborations to exploit the utility and potential of LLMs.
- Research Article
- 10.1016/j.nepr.2025.104488
- Aug 1, 2025
- Nurse education in practice
AI or nay? Evaluating the potential use of ChatGPT (Open AI) and Perplexity AI in undergraduate nursing research: An exploratory case study.
- Research Article
- 10.1152/advan.00137.2024
- Dec 1, 2024
- Advances in physiology education
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to evaluate the accuracy and reliability of these LLMs in assessing the achievement of learning outcomes across different cognitive domains in a scientific inquiry course on sports physiology. Human graders and three LLMs, GPT-3.5, GPT-4o, and Gemini, were tasked with scoring submitted student assignments according to a set of rubrics aligned with various cognitive domains, namely "Understand," "Analyze," and "Evaluate" from the revised Bloom's taxonomy and "Scientific Inquiry Competency." Our findings revealed that while the LLMs demonstrated some level of competency, they do not yet meet the assessment standards of human graders. Specifically, interrater reliability (percentage agreement and correlation analysis) between human graders was superior to that between two grading rounds of each LLM. Furthermore, concordance and correlation between human and LLM graders were mostly moderate to poor, both for overall scores and across the pre-specified cognitive domains. The results suggest a future where AI could complement human expertise in educational assessment but underscore the importance of adaptive learning by educators and continuous improvement in current AI technologies to fully realize this potential. NEW & NOTEWORTHY: The advent of large language models (LLMs) such as ChatGPT and Gemini has offered new learning and assessment opportunities to integrate artificial intelligence (AI) with education. This study evaluated the accuracy of LLMs in assessing an assignment from a course on sports physiology. Concordance and correlation between human graders and LLMs were mostly moderate to poor. The findings suggest AI's potential to complement human expertise in educational assessment alongside the need for adaptive learning by educators.
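For readers unfamiliar with the reliability measures named above, the following sketch computes percentage agreement and Pearson correlation between two grading rounds on made-up rubric scores; the data and scale are hypothetical.

```python
# Sketch of the two reliability measures named in the abstract: percentage
# agreement and Pearson correlation between two grading rounds. Scores are
# made-up; the study's actual rubric scales are not reproduced here.
import numpy as np
from scipy.stats import pearsonr

round_1 = np.array([4, 3, 5, 2, 4, 3, 5, 1])  # hypothetical rubric scores
round_2 = np.array([4, 2, 5, 2, 3, 3, 5, 2])

agreement = np.mean(round_1 == round_2) * 100  # exact-match agreement, %
r, p = pearsonr(round_1, round_2)

print(f"percentage agreement: {agreement:.1f}%")
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```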
- Research Article
- 10.6018/red.603001
- May 30, 2024
- Revista de Educación a Distancia (RED)
There is a significant gap in Computing Education Research (CER) concerning the impact of Large Language Models (LLMs) in advanced stages of degree programmes. This study aims to address this gap by investigating the effectiveness of LLMs in answering exam questions within an applied machine learning final-year undergraduate course. The research examines the performance of LLMs in responding to a range of exam questions, including proctored closed-book and open-book questions spanning various levels of Bloom's Taxonomy. Question formats encompassed open-ended, tabular data-based, and figure-based inquiries. To achieve this aim, the study has the following objectives: Comparative Analysis: To compare LLM-generated exam answers with actual student submissions to assess LLM performance. Detector Evaluation: To evaluate the efficacy of LLM detectors by directly inputting LLM-generated responses into these detectors. Additionally, assess detector performance on tampered LLM outputs designed to conceal their AI-generated origin. The research methodology used for this paper incorporates a staff-student partnership model involving eight academic staff and six students. Students play integral roles in shaping the project's direction, particularly in areas unfamiliar to academic staff, such as specific tools to avoid LLM detection. This study contributes to the understanding of LLMs' role in advanced education settings, with implications for future curriculum design and assessment methodologies.
- Research Article
- 10.2196/52113
- Jan 23, 2024
- Journal of medical Internet research
Large language models such as GPT-4 (Generative Pre-trained Transformer 4) are being increasingly used in medicine and medical education. However, these models are prone to "hallucinations" (ie, outputs that seem convincing while being factually incorrect). It is currently unknown how these errors by large language models relate to the different cognitive levels defined in Bloom's taxonomy. This study aims to explore how GPT-4 performs in terms of Bloom's taxonomy using psychosomatic medicine exam questions. We used a large data set of psychosomatic medicine multiple-choice questions (N=307) with real-world results derived from medical school exams. GPT-4 answered the multiple-choice questions using 2 distinct prompt versions: detailed and short. The answers were analyzed using a quantitative approach and a qualitative approach. Focusing on incorrectly answered questions, we categorized reasoning errors according to the hierarchical framework of Bloom's taxonomy. GPT-4's performance in answering exam questions yielded a high success rate: 93% (284/307) for the detailed prompt and 91% (278/307) for the short prompt. Questions answered correctly by GPT-4 had a statistically significant higher difficulty than questions answered incorrectly (P=.002 for the detailed prompt and P<.001 for the short prompt). Independent of the prompt, GPT-4's lowest exam performance was 78.9% (15/19), thereby always surpassing the "pass" threshold. Our qualitative analysis of incorrect answers, based on Bloom's taxonomy, showed that errors were primarily in the "remember" (29/68) and "understand" (23/68) cognitive levels; specific issues arose in recalling details, understanding conceptual relationships, and adhering to standardized guidelines. GPT-4 demonstrated a remarkable success rate when confronted with psychosomatic medicine multiple-choice exam questions, aligning with previous findings. When evaluated through Bloom's taxonomy, our data revealed that GPT-4 occasionally ignored specific facts (remember), provided illogical reasoning (understand), or failed to apply concepts to a new situation (apply). These errors, which were confidently presented, could be attributed to inherent model biases and the tendency to generate outputs that maximize likelihood.
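The abstract does not name the statistical test behind the difficulty comparison, so the sketch below assumes a Mann-Whitney U test on hypothetical item-difficulty indices simply to illustrate the shape of the analysis.

```python
# Sketch of the difficulty comparison the abstract reports. The test used in
# the paper is not named here, so a Mann-Whitney U test is assumed; the item
# difficulty indices below are made-up placeholders.
import numpy as np
from scipy.stats import mannwhitneyu

difficulty_correct = np.array([0.82, 0.75, 0.91, 0.68, 0.88])    # hypothetical
difficulty_incorrect = np.array([0.55, 0.61, 0.48, 0.70, 0.52])  # hypothetical

stat, p = mannwhitneyu(difficulty_correct, difficulty_incorrect,
                       alternative="two-sided")
print(f"U = {stat}, p = {p:.3f}")
```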
- Research Article
- 10.29121/shodhkosh.v5.i5.2024.6108
- May 31, 2024
- ShodhKosh: Journal of Visual and Performing Arts
The creation of assessment questions that align with Bloom's taxonomy levels and achieve Course Outcomes (COs) is a critical yet complex task in Outcome-Based Education (OBE). Traditional manual methods, reliant on subject experts, are time-consuming and prone to gaps in addressing all COs or Bloom's levels. While Large Language Models (LLMs) like ChatGPT can generate questions, they lack access to private data, including prescribed textbooks and syllabi, potentially leading to questions beyond the scope of the curriculum. This paper presents a novel system leveraging Retrieval-Augmented Generation (RAG) to automate the generation of Bloom's taxonomy-based questions within the syllabus scope, ensuring comprehensive CO attainment. The proposed system integrates a vector database to store private data, including scanned textbooks, syllabi, Bloom's taxonomy levels, and COs. The RAG model, trained on this curated dataset, generates questions that fulfill the cognitive, psychomotor, and affective domain requirements specified in the syllabus. This approach not only ensures alignment with educational objectives but also significantly reduces the manual effort involved in question preparation. The system's efficacy is demonstrated through its ability to produce high-quality, targeted questions that effectively support OBE evaluation and enhance educational quality. This innovation addresses a critical gap in automated question generation for modern education systems.
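A minimal sketch of the retrieval step such a system might use is shown below: embed syllabus chunks, retrieve the most relevant one, and prompt the model with it. The model names, example chunks, and prompt are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of the retrieval step in a RAG question generator: embed
# syllabus chunks, retrieve the most relevant one for a topic, and prompt the
# model with it. Model names and prompt wording are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()

syllabus_chunks = [
    "Unit 3: Normalization in relational databases (1NF, 2NF, 3NF).",
    "Unit 5: Transaction management and ACID properties.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

chunk_vecs = embed(syllabus_chunks)
query_vec = embed(["Write an 'Apply'-level question on normalization (CO2)."])[0]

# cosine-similarity retrieval over the stored chunks
sims = chunk_vecs @ query_vec / (
    np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec))
context = syllabus_chunks[int(np.argmax(sims))]

prompt = (f"Using only this syllabus excerpt:\n{context}\n"
          "Generate one 'Apply'-level question that assesses CO2.")
answer = client.chat.completions.create(
    model="gpt-4", messages=[{"role": "user", "content": prompt}])
print(answer.choices[0].message.content)
```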
- Research Article
- 10.1001/jamanetworkopen.2023.46721
- Dec 7, 2023
- JAMA network open
Recent advancements in large language models (LLMs) have shown potential in a wide array of applications, including health care. While LLMs have shown heterogeneous results across specialized medical board examinations, their performance on neurology board examinations remains unexplored. The objective of this cross-sectional study, conducted between May 17 and May 31, 2023, was to assess the performance of LLMs on neurology board-style examinations. The evaluation utilized a question bank approved by the American Board of Psychiatry and Neurology and was validated with a small question cohort by the European Board for Neurology. All questions were categorized into lower-order (recall, understanding) and higher-order (apply, analyze, synthesize) questions based on the Bloom taxonomy for learning and assessment. Performance by LLM ChatGPT versions 3.5 (LLM 1) and 4 (LLM 2) was assessed in relation to overall scores, question type, and topics, along with the confidence level and reproducibility of answers. The main outcome was the overall percentage score of the 2 LLMs. LLM 2 significantly outperformed LLM 1, correctly answering 1662 of 1956 questions (85.0%) vs 1306 questions (66.8%) for LLM 1. Notably, LLM 2's performance exceeded the mean human score of 73.8%, effectively achieving near-passing and passing grades in the neurology board examination. LLM 2 outperformed human users in behavioral, cognitive, and psychological-related questions and demonstrated superior performance to LLM 1 in 6 categories. Both LLMs performed better on lower-order than higher-order questions, with LLM 2 excelling in both lower-order and higher-order questions. Both models consistently used confident language, even when providing incorrect answers. Reproducible answers of both LLMs were associated with a higher percentage of correct answers than inconsistent answers. Despite the absence of neurology-specific training, LLM 2 demonstrated commendable performance, whereas LLM 1 performed slightly below the human average. While higher-order cognitive tasks were more challenging for both models, LLM 2's results were equivalent to passing grades in specialized neurology examinations. These findings suggest that LLMs could have significant applications in clinical neurology and health care with further refinements.
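As a quick check on the headline comparison, the reported counts (1662/1956 for LLM 2 vs 1306/1956 for LLM 1) can be tested with a chi-square on the 2x2 table; the choice of test here is an assumption, since the abstract does not restate the paper's exact method.

```python
# Sketch reproducing the headline comparison from the reported counts.
# The test choice (chi-square on the 2x2 table) is an assumption.
from scipy.stats import chi2_contingency

table = [[1662, 1956 - 1662],   # LLM 2: correct, incorrect
         [1306, 1956 - 1306]]   # LLM 1: correct, incorrect
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2e}")
```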
- Book Chapter
- 10.1007/978-981-97-8588-9_23
- Jan 1, 2025
In the educational world, leveraging advanced technology, particularly for accreditation tasks, presents a promising avenue for enhancing efficiency and user experience. This study implements a web application integrating the GPT-4 model via OpenAI's Application Programming Interface (API) to streamline the National Commission for Academic Accreditation & Assessment (NCAAA) accreditation for Computer Science postgraduate programs at King Abdulaziz University (KAU), Saudi Arabia. Traditionally, fulfilling these requirements entailed a substantial workload, including crafting detailed course reports and updating assessment questions to align with Course Learning Outcomes (CLOs) and Bloom's Taxonomy levels, typically consuming about 5 hours per course and resulting in delayed submissions. Our solution employs a GPT-4 Large Language Model (LLM) with prompt engineering and OpenAI's API to automate the drafting of course reports and the generation of assessment questions, reducing task completion time by approximately 90% and encouraging timely submissions. The system's asynchronous design allows for automated background processing, and its modular architecture eases development and testing in line with software engineering practice. Preliminary user feedback attests to the system's capacity to significantly ease the burden of the accreditation process, attributed to its intuitive user interface, autocomplete functionalities, and the capability to upload draft questions for assessments. This research demonstrates the potential of Artificial Intelligence (AI), particularly LLM and prompt engineering techniques, not only to improve manual accreditation tasks but also to support wider adoption and further exploration of such technologies in academic settings, thereby making the accreditation process more efficient across university departments in the Kingdom.
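A minimal sketch of the kind of asynchronous API call the chapter describes follows, assuming the OpenAI Python client; the prompt text, model, and example CLO are illustrative.

```python
# Sketch of an asynchronous background call for drafting assessment questions
# aligned with a CLO via OpenAI's API. Prompt text, model, and the CLO
# example are illustrative assumptions, not the chapter's implementation.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def draft_questions(clo: str, bloom_level: str) -> str:
    resp = await client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (f"Draft two exam questions assessing this Course "
                        f"Learning Outcome at Bloom's '{bloom_level}' level: "
                        f"{clo}"),
        }],
    )
    return resp.choices[0].message.content

print(asyncio.run(draft_questions(
    "Design a normalized relational schema for a given scenario", "Create")))
```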
- Research Article
- 10.7759/cureus.65642
- Jul 29, 2024
- Cureus
Introduction Theory question papers form an important part of assessment in medical education. As per the Competency-Based Medical Education (CBME) guidelines 2019, questions should test higher levels of cognition. This pilot study analyzes 60 question papers from different universities in Gujarat for their construct and content validity. The aim was to analyze the quality of physiology question papers from various medical universities in Gujarat to gain insights into assessment quality and its alignment with the CBME guidelines. The objectives were twofold: to evaluate the "construct validity" and "content validity" of these physiology theory question papers over the past three years according to the CBME standards. Methods An observational study using a cross-sectional records-based approach was carried out, evaluating 60 summative exam question papers in physiology from eight different universities of Gujarat for their construct and content validity. Using Bloom's taxonomy, the cognitive-domain learning level of each question was assessed, and findings were compared across the sampled papers. Results A total of 1842 questions were analyzed from the 60 question papers of eight different universities of the Gujarat state. The numbers of questions at each level of cognition in Bloom's taxonomy (remember, understand, apply, analyze, evaluate, and create) were 560 (30.40%), 434 (23.26%), 222 (12.05%), 118 (6.41%), 94 (5.10%), and 0 (0.00%), respectively. A total of 414 (22.48%) questions did not contain any verb, so they did not fit into any level of Bloom's taxonomy. The majority of questions (1773, 96.25%) were drawn from the core competencies, while a small percentage (69, 3.75%) came from the non-core competencies of physiology. Conclusion The majority of questions in the summative physiology question papers were at the "remember" and "understand" levels of Bloom's taxonomy, and roughly a quarter of the questions did not contain any verb. There is a need to incorporate more questions testing higher levels of cognition and for universities to use blueprints. Faculty training is also necessary to bring about course correction.
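The verb-based classification the study applied can be sketched as a simple lookup from a question's leading action verb to a Bloom level, with unmatched questions flagged as having no verb; the verb lists below are abbreviated illustrations.

```python
# Sketch of verb-based Bloom classification: map the leading action verb of
# each question to a Bloom level, flagging questions with no recognizable
# verb. The verb lists are abbreviated illustrations, not the study's coding.
BLOOM_VERBS = {
    "remember": {"define", "list", "name", "state"},
    "understand": {"explain", "describe", "summarize", "classify"},
    "apply": {"calculate", "demonstrate", "solve", "use"},
    "analyze": {"compare", "differentiate", "analyze", "examine"},
    "evaluate": {"justify", "critique", "assess", "evaluate"},
    "create": {"design", "construct", "formulate", "compose"},
}

def bloom_level(question: str) -> str:
    first_word = question.lower().split()[0].strip(":,.")
    for level, verbs in BLOOM_VERBS.items():
        if first_word in verbs:
            return level
    return "no verb / unclassified"

print(bloom_level("Explain the mechanism of muscle contraction."))  # understand
print(bloom_level("Mechanism of muscle contraction."))              # unclassified
```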
- Research Article
- 10.1097/corr.0000000000002704
- May 23, 2023
- Clinical orthopaedics and related research
Advances in neural networks, deep learning, and artificial intelligence (AI) have progressed recently. Previous deep learning AI has been structured around domain-specific areas that are trained on dataset-specific areas of interest that yield high accuracy and precision. A new AI model using large language models (LLM) and nonspecific domain areas, ChatGPT (OpenAI), has gained attention. Although AI has demonstrated proficiency in managing vast amounts of data, implementation of that knowledge remains a challenge. (1) What percentage of Orthopaedic In-Training Examination questions can a generative, pretrained transformer chatbot (ChatGPT) answer correctly? (2) How does that percentage compare with results achieved by orthopaedic residents of different levels, and if scoring lower than the 10th percentile relative to 5th-year residents is likely to correspond to a failing American Board of Orthopaedic Surgery score, is this LLM likely to pass the orthopaedic surgery written boards? (3) Does increasing question taxonomy affect the LLM's ability to select the correct answer choices? This study randomly selected 400 of 3840 publicly available questions based on the Orthopaedic In-Training Examination and compared the mean score with that of residents who took the test over a 5-year period. Questions with figures, diagrams, or charts were excluded, including five questions the LLM could not provide an answer for, resulting in 207 questions administered with raw score recorded. The LLM's answer results were compared with the Orthopaedic In-Training Examination ranking of orthopaedic surgery residents. Based on the findings of an earlier study, a pass-fail cutoff was set at the 10th percentile. Questions answered were then categorized based on the Buckwalter taxonomy of recall, which deals with increasingly complex levels of interpretation and application of knowledge; comparison was made of the LLM's performance across taxonomic levels and was analyzed using a chi-square test. ChatGPT selected the correct answer 47% (97 of 207) of the time, and 53% (110 of 207) of the time it answered incorrectly. Based on prior Orthopaedic In-Training Examination testing, the LLM scored in the 40th percentile for postgraduate year (PGY) 1s, the eighth percentile for PGY2s, and the first percentile for PGY3s, PGY4s, and PGY5s; based on the latter finding (and using a predefined cutoff of the 10th percentile of PGY5s as the threshold for a passing score), it seems unlikely that the LLM would pass the written board examination. The LLM's performance decreased as question taxonomy level increased (it answered 54% [54 of 101] of Tax 1 questions correctly, 51% [18 of 35] of Tax 2 questions correctly, and 34% [24 of 71] of Tax 3 questions correctly; p = 0.034). Although this general-domain LLM has a low likelihood of passing the orthopaedic surgery board examination, its testing performance and knowledge are comparable to those of a first-year orthopaedic surgery resident. The LLM's ability to provide accurate answers declines with increasing question taxonomy and complexity, indicating a deficiency in implementing knowledge. Current AI appears to perform better at knowledge- and interpretation-based inquiries, and, based on this study and other areas of opportunity, it may become an additional tool for orthopaedic learning and education.
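The taxonomy effect can be checked directly from the counts in the abstract; a 3x2 chi-square on those counts gives a p value close to the reported p = 0.034.

```python
# Worked check of the reported taxonomy effect using the counts in the
# abstract: 54/101 (Tax 1), 18/35 (Tax 2), 24/71 (Tax 3) answered correctly.
from scipy.stats import chi2_contingency

table = [[54, 101 - 54],   # Tax 1: correct, incorrect
         [18, 35 - 18],    # Tax 2
         [24, 71 - 24]]    # Tax 3
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")  # p close to 0.034
```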
- Research Article
- 10.1007/s10459-025-10462-3
- Aug 6, 2025
- Advances in health sciences education : theory and practice
Many medical schools primarily use multiple-choice questions (MCQs) in pre-clinical assessments due to their efficiency and consistency. However, while MCQs are easy to grade, they often fall short in evaluating higher-order reasoning and understanding student thought processes. Despite these limitations, MCQs remain popular because alternative assessments require more time and resources to grade. This study explored whether OpenAI's GPT-4o Large Language Model (LLM) could be used to effectively grade narrative short answer questions (SAQs) in case-based learning (CBL) exams when compared to faculty graders. The primary outcome was equivalence of LLM grading, assessed using a bootstrapping procedure to calculate 95% confidence intervals (CIs) for mean score differences. Equivalence was defined as the entire 95% CI falling within a ± 5% margin. Secondary outcomes included grading precision, subgroup analysis by Bloom's taxonomy, and correlation between question complexity and LLM performance. Analysis of 1,450 responses showed LLM scores were equivalent to faculty scores overall (mean difference: -0.55%, 95% CI: -1.53%, + 0.45%). Equivalence was also demonstrated for Remembering, Applying, and Analyzing questions; however, discrepancies were observed for Understanding and Evaluating questions. AI grading demonstrated high precision (ICC = 0.993, 95% CI: 0.992-0.994). Greater differences between LLM and faculty scores were found for more difficult questions (R2 = 0.6199, p < 0.0001). LLM grading could serve as a tool for preliminary scoring of student assessments, enhancing SAQ grading efficiency and improving undergraduate medical education examination quality. Secondary outcome findings emphasize the need to use these tools in combination with, not as a replacement for, faculty involvement in the grading process.
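A sketch of the equivalence procedure described above, on made-up scores: bootstrap a 95% CI for the mean LLM-minus-faculty difference and check whether the entire interval falls within the ±5% margin.

```python
# Sketch of bootstrap equivalence testing: 95% CI for the mean score
# difference, checked against a ±5% margin. Scores are made-up placeholders.
import numpy as np

rng = np.random.default_rng(0)
faculty = rng.normal(80, 10, size=200)         # hypothetical faculty % scores
llm = faculty + rng.normal(-0.5, 4, size=200)  # hypothetical LLM % scores
diff = llm - faculty

boot_means = np.array([
    rng.choice(diff, size=diff.size, replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
equivalent = -5.0 < lo and hi < 5.0  # entire CI inside the ±5% margin
print(f"mean diff = {diff.mean():.2f}%, 95% CI = ({lo:.2f}, {hi:.2f}), "
      f"equivalent: {equivalent}")
```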
- Research Article
- 10.2196/58158
- Jul 22, 2024
- Journal of medical Internet research
The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This study focused on evaluating and enhancing the clinical capabilities and explainability of LLMs in specific domains, using OA management as a case study. A domain-specific benchmark framework was developed to evaluate LLMs across a spectrum from domain-specific knowledge to clinical applications in real-world clinical scenarios. DocOA, a specialized LLM designed for OA management integrating retrieval-augmented generation and instructional prompts, was developed. It can identify the clinical evidence upon which its answers are based through retrieval-augmented generation, thereby demonstrating the explainability of those answers. The study compared the performance of GPT-3.5, GPT-4, and a specialized assistant, DocOA, using objective and human evaluations. Results showed that general LLMs such as GPT-3.5 and GPT-4 were less effective in the specialized domain of OA management, particularly in providing personalized treatment recommendations. However, DocOA showed significant improvements. This study introduces a novel benchmark framework that assesses the domain-specific abilities of LLMs in multiple aspects, highlights the limitations of generalized LLMs in clinical contexts, and demonstrates the potential of tailored approaches for developing domain-specific medical LLMs.
- Research Article
- 10.47172/2965-730x.sdgsreview.v5.n02.pe03303
- Jan 13, 2025
- Journal of Lifestyle and SDGs Review
Objective: The objective of this study is to assess the cognitive levels of Bloom's taxonomy that are emphasized in teaching speaking skills and to evaluate how students from diverse backgrounds in Pakistani universities perceive the advanced cognitive levels of Bloom's taxonomy necessary for developing the speaking skills required in the workplace. Theoretical Framework: The research's fundamental concepts and theories are based on Richards and Rodgers' language teaching model (2001). Method: This exploratory, case-based, qualitative study collected data from public and private universities in Islamabad. Data sources included the selected HEC Functional English curriculum, interviews with teachers and students, and classroom observations. Results and Discussion: The results revealed that, to improve students' speaking abilities, the HEC's speaking curriculum should emphasize all levels of the cognitive domain outlined in Bloom's taxonomy; currently, the curriculum focuses primarily on higher levels of cognitive processing. The curriculum must be revised to include effective comprehension exercises, tailored to students from various backgrounds, across all cognitive levels to help students enhance their speaking skills. Research Implication: Bloom's taxonomy is a valuable roadmap for teachers to develop students' critical skills. The findings will assist universities and curriculum designers in developing curricula aligned with Bloom's cognitive domain for the professional development of graduates. Originality/Value: This study fills a gap in the literature by examining the cognitive domain levels that build the speaking skills required in the workplace, an area that has received little attention in the Pakistani context.
- Research Article
- 10.3390/educsci15081029
- Aug 11, 2025
- Education Sciences
Educational assessment relies on well-constructed test items to measure student learning accurately, yet traditional item development is time-consuming and demands specialized psychometric expertise. Automatic item generation (AIG) offers template-based scalability, and recent large language model (LLM) advances promise to democratize item creation. However, fully automated approaches risk introducing factual errors, bias, and uneven difficulty. To address these challenges, we propose and evaluate a hybrid human-in-the-loop (HITL) framework for AIG that combines psychometric rigor with the linguistic flexibility of LLMs. In a Spring 2025 case study at Franklin University Switzerland, the instructor collaborated with ChatGPT (o4-mini-high) to generate parallel exam variants for two undergraduate business courses: Quantitative Reasoning and Data Mining. The instructor began by defining “radical” and “incidental” parameters to guide the model. Through iterative cycles of prompt, review, and refinement, the instructor validated content accuracy, calibrated difficulty, and mitigated bias. All interactions (including prompt templates, AI outputs, and human edits) were systematically documented, creating a transparent audit trail. Our findings demonstrate that a HITL approach to AIG can produce diverse, psychometrically equivalent exam forms with reduced development time, while preserving item validity and fairness, and potentially reducing cheating. This offers a replicable pathway for harnessing LLMs in educational measurement without sacrificing quality, equity, or accountability.
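The item-model idea behind such frameworks can be sketched as a template whose "radical" parameters change the numbers (and the key) across exam variants while "incidental" parameters change only surface features; the template below is a hypothetical Quantitative Reasoning item, not one from the study.

```python
# Sketch of template-based AIG with radical and incidental parameters.
# The item template is a hypothetical example, not one from the study.
import random

TEMPLATE = ("{name} invests ${principal} at {rate}% simple annual interest. "
            "How much interest is earned after {years} years?")

def generate_variant(seed: int) -> tuple[str, float]:
    rng = random.Random(seed)
    principal = rng.choice([1000, 2000, 5000])   # radical: changes the answer
    rate = rng.choice([3, 4, 5])                 # radical
    years = rng.choice([2, 3])                   # radical
    name = rng.choice(["Ana", "Bilal", "Chen"])  # incidental: surface only
    answer = principal * rate / 100 * years      # simple interest = P*r*t
    return TEMPLATE.format(name=name, principal=principal,
                           rate=rate, years=years), answer

for seed in range(2):
    stem, key = generate_variant(seed)
    print(stem, "->", key)
```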
- Research Article
- 10.1002/jcal.70160
- Dec 2, 2025
- Journal of Computer Assisted Learning
Background Sustainability education emphasises critical thinking and interdisciplinary understanding, making the assessment of students' learning outcomes complex. While Large Language Models (LLMs) have shown promise in educational assessment, their reliability in domains requiring contextual reasoning, such as sustainability, remains unclear. Objectives This study aims to evaluate the agreement between human raters and several LLMs (GPT-4o, Gemini 2.0 Flash, DeepSeek V3, LLaMA 3.3) in assessing short-answer responses from a university-level Sustainability course. It also investigates how this agreement varies across cognitive skill levels. Methods A total of 232 short-answer responses were evaluated using a rubric aligned with Bloom's Revised Taxonomy. Consensus scores from human raters were compared to LLM-generated scores using multiple statistical measures, including Quadratic Weighted Kappa (QWK), Intraclass Correlation Coefficient (ICC), Pearson correlation, and distributional overlap. Results Moderate agreement was found between LLMs and human raters in total scores (QWK: 0.585–0.640; r: 0.660–0.668; distributional overlap: 0.681–0.803). Inter-rater reliability among humans was good to excellent (ICC: 0.667–0.800). Criterion-level agreement declined as cognitive complexity increased, with notably low agreement when evaluating higher-order skills. Conclusions Overall, LLM-human agreement was moderate on total scores but declined at higher cognitive levels, indicating that LLMs are suitable for basic comprehension checks while human oversight remains necessary for complex reasoning.
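The agreement statistics named above are standard; the sketch below computes quadratic weighted kappa and Pearson r on made-up ratings to show how such figures are produced.

```python
# Sketch of two agreement statistics named in the abstract, computed on
# made-up ratings: quadratic weighted kappa (sklearn) and Pearson r (scipy).
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

human = np.array([3, 2, 4, 1, 3, 4, 2, 3])  # hypothetical consensus scores
llm   = np.array([3, 2, 3, 2, 3, 4, 1, 3])  # hypothetical LLM scores

qwk = cohen_kappa_score(human, llm, weights="quadratic")
r, _ = pearsonr(human, llm)
print(f"QWK = {qwk:.3f}, Pearson r = {r:.3f}")
```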