Abstract
Software that helps higher education instructors remove poor-quality test items and set appropriate grade boundaries is generally lacking. To address these challenges, the SmartStandardSet system provides a graphical user interface for removing defective items, weighting student scores using a two-parameter item response theory (IRT) model, and a mechanism for standard setting. We evaluated the system through six interviews with teachers and six focus groups involving 19 students to understand how key stakeholders would view the use of the tool in practice. Both groups generally reported high levels of feasibility, accuracy, and utility in SmartStandardSet's statistical scoring of items and score calculation for test-takers. Teachers indicated that the data displays would help them improve future test items; students indicated that the system would be fairer and would motivate greater effort on more difficult test items. However, both groups had concerns about implementing the system without institutional policy endorsement. Students were specifically concerned that academics might set grade boundaries on arbitrary and invalid grounds. Our results provide useful insights into the perceived benefits of using the tool for standard setting and suggest concrete next steps for gaining wider acceptance, which will be the focus of future work.
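For orientation, the two-parameter logistic (2PL) model underlying this kind of IRT analysis is conventionally written as below; the notation is a standard formulation supplied here for reference, not reproduced from the paper itself:

\[
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}},
\]

where \(\theta_i\) is the latent ability of student \(i\), \(a_j\) is the discrimination of item \(j\), and \(b_j\) is its difficulty. Weighting scores with this model means that correct answers to hard, highly discriminating items contribute more to a student's estimated ability than correct answers to easy ones.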
Highlights
Grade boundaries for tests are usually related to the proportion of items answered correctly
We describe a software system, SmartStandardSet, that automates IRT analysis of multiple-choice question (MCQ) test items, calculates weighted scores for students, and allows grade boundaries to be set according to standards-based judgements by higher education instructors
Instructors generally perceived the scores created by SmartStandardSet as accurate and understood how the system removed the guesswork from producing a credible, statistically informed score
Summary
Grade boundaries for tests are usually related to the proportion of items answered correctly. This is potentially misleading because test difficulty or easiness is not considered (e.g., easy tests produce high scores). There may also be resistance from both students and lecturers to accepting IRT-based scoring in environments where it is not approved by policy. This paper addresses both gaps by: 1) describing a newly developed prototype tool, SmartStandardSet, for performing test quality evaluation and standard setting, and 2) conducting an exploratory pilot evaluation from the perspective of intended stakeholders concerning the utility, feasibility, accuracy, and propriety of the system. This preliminary evaluation gauges the acceptance of a potentially major change in how multiple-choice tests are evaluated and prepared for grading; despite its small scale, it is warranted and provides useful insights.
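To make the contrast with proportion-correct grading concrete, the Python sketch below is purely illustrative: the item parameters, response patterns, and crude grid-search estimator are our assumptions, not SmartStandardSet's implementation. It shows how two students with the same raw score can receive different 2PL ability estimates when the items they miss differ in difficulty and discrimination.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical three-item test: two easy items, one hard but discriminating.
items = [(0.8, -1.5), (0.9, -1.0), (1.6, 1.2)]  # (a, b) pairs

# Two response patterns with the same raw score (2 of 3 correct):
# student A misses the hard item; student B misses an easy one.
responses_a = [1, 1, 0]
responses_b = [0, 1, 1]

def log_likelihood(theta, responses):
    """Log-likelihood of a response pattern under the 2PL model."""
    ll = 0.0
    for (a, b), x in zip(items, responses):
        p = p_correct(theta, a, b)
        ll += math.log(p if x else 1.0 - p)
    return ll

def mle_theta(responses, lo=-4.0, hi=4.0, steps=800):
    """Crude grid-search maximum-likelihood estimate of ability."""
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    return max(grid, key=lambda t: log_likelihood(t, responses))

print(f"theta A: {mle_theta(responses_a):+.2f}")  # roughly +0.4
print(f"theta B: {mle_theta(responses_b):+.2f}")  # roughly +1.4, despite the same raw score
```

Under these (hypothetical) parameters, the student who answers the harder, more discriminating item correctly receives the higher ability estimate even though both raw scores are 2/3; this difficulty-aware weighting is exactly what proportion-correct grading ignores.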