Using Machine Learning to Score Multi-Dimensional Assessments of Chemistry and Physics

Sarah Maestrales, Barbara Schneider, Quinton Baker, Israel Touitou, Xiaoming Zhai, Joseph Krajcik

doi:10.1007/s10956-020-09895-9

Abstract

In response to the call for promoting three-dimensional science learning (NRC, 2012), researchers argue for assessment items that go beyond rote memorization to require deeper understanding and the use of reasoning that can improve science literacy. Such items are usually performance-based constructed-response tasks, and they need technological support to ease the scoring burden placed on teachers. This study responds to that need by examining the use and accuracy of a machine learning text analysis protocol as an alternative to human scoring of constructed response items. The items we employed represent multiple dimensions of science learning as articulated in the 2012 NRC report. Using a sample of over 26,000 constructed responses written by 6,700 students in chemistry and physics, we trained human raters and compiled a robust training set to develop machine algorithmic models and cross-validate the machine scores. Results show that human raters achieved good (Cohen’s κ = .40–.75) to excellent (Cohen’s κ > .75) interrater reliability on assessment items with varied numbers of dimensions. A comparison reveals that the machine scoring algorithms achieved scoring accuracy comparable to that of human raters on these same items. Results also show that responses with formal vocabulary (e.g., velocity) were likely to yield lower machine-human agreement, which may be because fewer students used formal phrases than used the informal alternatives.
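The agreement statistic reported in the abstract, Cohen's κ, can be illustrated with a minimal sketch. It assumes two raters (for example, a human and a machine) each assign one categorical score per response. The function names, the example scores, and the label used for values below .40 are hypothetical and are not taken from the study; only the "good" (.40–.75) and "excellent" (> .75) bands come from the abstract.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical scores."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of responses given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportion, per label.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the bands quoted in the abstract (lowest label is an assumption)."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "good"
    return "below good"


# Hypothetical scores for ten responses (1 = dimension present, 0 = absent).
human_scores = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
machine_scores = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

kappa = cohens_kappa(human_scores, machine_scores)
band = agreement_band(kappa)
print(f"kappa = {kappa:.3f} ({band})")
```

In this made-up example the raters agree on 8 of 10 responses, but because chance agreement is .52, κ works out to about .58, which falls in the "good" band. This is why κ is a stricter measure than raw percent agreement.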

Journal: Journal of Science Education and Technology
Publication Date: Mar 26, 2021
Citations: 34