Abstract

Item response theory (IRT) is becoming an increasingly important tool for analyzing "Big Data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are violated when it is deployed in the online realm. For a large-enrollment physics course for scientists and engineers, this study compares the outcomes of IRT analyses of exam and homework data, and then investigates the effect of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with the exam data is only moderate. It is also found that learner ability and item discrimination are robust over wide ranges with respect to model assumptions and introduced noise; item difficulty is less so.
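As a point of reference for the quantities named above (learner ability, item difficulty, item discrimination), the textbook two-parameter logistic (2PL) item response function is

    P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}

where \theta is a learner's ability, b_i the difficulty of item i, and a_i its discrimination. Which model variant the study actually fits is described in its IRT-model and three-parameter-model sections; the formula above is only the standard starting point.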

Highlights

  • Item response theory (IRT) offers the opportunity for data-driven development and evaluation of assessment items, such as homework, practice, concept inventory, and exam problems

  • For IRT to be reliable, a number of implicit assumptions must hold; these are usually fulfilled in standard exam settings

  • To investigate the influence of violating some of the basic assumptions of IRT, this study considers a large-enrollment (256-student) physics course for scientists and engineers

Summary

INTRODUCTION

Item response theory (IRT) offers the opportunity for data-driven development and evaluation of assessment items, such as homework, practice, concept inventory, and exam problems. One learner's inherent, constant ability may result in different scores on different sets of test items, depending on how difficult, well written, meaningful, or representative those items are. Copying and guessing result in a learner's score on an item not reflecting that learner's own best-effort ability. From a measurement perspective, allowing multiple attempts makes scores less meaningful: even low-ability students might eventually get the items correct. Data from online homework could be the key to helping these learners, but how reliable is IRT based on online formative-assessment data for determining learner ability?
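To make the question concrete, the following is a minimal sketch, not the study's actual analysis pipeline, of how learner abilities and 2PL item parameters can be estimated jointly from a binary response matrix. The synthetic data, the alternating maximum-likelihood scheme, and all numerical choices here are illustrative assumptions.

    # Minimal illustrative sketch (hypothetical, not the study's pipeline):
    # jointly estimate 2PL learner abilities and item parameters from a
    # binary response matrix by alternating maximum likelihood.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit  # logistic function 1 / (1 + exp(-x))

    rng = np.random.default_rng(0)

    # Synthetic stand-in data: 256 learners (the enrollment of the course
    # studied) responding to 30 hypothetical items.
    n_learners, n_items = 256, 30
    true_theta = rng.normal(0.0, 1.0, n_learners)   # learner abilities
    true_a = rng.uniform(0.5, 2.0, n_items)         # item discriminations
    true_b = rng.normal(0.0, 1.0, n_items)          # item difficulties
    p = expit(true_a * (true_theta[:, None] - true_b))
    responses = (rng.uniform(size=p.shape) < p).astype(float)

    def neg_log_likelihood(theta, a, b, resp):
        """Negative Bernoulli log-likelihood of the 2PL model."""
        prob = np.clip(expit(a * (theta[:, None] - b)), 1e-9, 1 - 1e-9)
        return -np.sum(resp * np.log(prob) + (1 - resp) * np.log(1 - prob))

    # Alternate between updating abilities (items fixed) and updating item
    # parameters (abilities fixed); rescale abilities to fix the latent scale.
    theta = np.zeros(n_learners)
    a, b = np.ones(n_items), np.zeros(n_items)
    for _ in range(10):
        theta = minimize(lambda t: neg_log_likelihood(t, a, b, responses),
                         theta, method="L-BFGS-B").x
        theta = (theta - theta.mean()) / theta.std()
        fit = minimize(lambda ab: neg_log_likelihood(theta, ab[:n_items],
                                                     ab[n_items:], responses),
                       np.concatenate([a, b]), method="L-BFGS-B",
                       bounds=[(0.05, 5.0)] * n_items + [(-4.0, 4.0)] * n_items)
        a, b = fit.x[:n_items], fit.x[n_items:]

    # How well are the generating parameters recovered?
    print("ability correlation:       ", np.corrcoef(theta, true_theta)[0, 1])
    print("difficulty correlation:    ", np.corrcoef(b, true_b)[0, 1])
    print("discrimination correlation:", np.corrcoef(a, true_a)[0, 1])

In practice, marginal maximum-likelihood routines such as those in R's ltm or mirt packages are typically used instead of a hand-rolled alternating fit; the sketch only illustrates what estimating ability, difficulty, and discrimination from response data means operationally.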

The remaining sections of the paper cover:

  • The course
  • IRT model
  • Item characteristics
  • Learner ability
  • Thresholds for number of attempts
  • Log-likelihood
  • Item difficulty
  • Item discrimination
  • Partial credit for multiple attempts
  • Discussion of multiple attempts
  • Three parameter model
  • Earlier parts of the semester
  • Modeling guessing and copying (an illustrative noise-injection sketch follows this outline)
  • Findings
  • CONCLUSIONS
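The outline items on the three-parameter model and on modeling guessing and copying point to the kind of perturbation experiment meant by "introduced noise" in the abstract. Below is a purely illustrative sketch of such noise injection; the rates, the copying mechanism, and the function name inject_noise are assumptions, not the study's procedure.

    # Hypothetical illustration of the kind of "introduced noise" probed in
    # the robustness analysis: corrupt a binary response matrix with simulated
    # guessing and copying, then refit the IRT model and compare parameters.
    # The rates and mechanisms below are assumptions for the sketch only.
    import numpy as np

    def inject_noise(responses, guess_rate=0.2, copy_fraction=0.05, rng=None):
        """Return a copy of `responses` with simulated guessing and copying."""
        if rng is None:
            rng = np.random.default_rng()
        noisy = responses.copy()
        n_learners, _ = noisy.shape
        # Guessing: an originally incorrect response turns correct with
        # probability `guess_rate` (e.g., ~0.2-0.25 for multiple-choice items).
        guesses = (noisy == 0) & (rng.uniform(size=noisy.shape) < guess_rate)
        noisy[guesses] = 1
        # Copying: a fraction of learners replace their responses wholesale
        # with those of another randomly chosen learner.
        copiers = rng.uniform(size=n_learners) < copy_fraction
        sources = rng.integers(0, n_learners, size=n_learners)
        noisy[copiers] = responses[sources[copiers]]
        return noisy

Refitting the 2PL sketch above on inject_noise(responses) and comparing the recovered abilities, difficulties, and discriminations with the clean-data fit gives a rough sense of how robust each parameter is to such confounds.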