Abstract

Purpose

Diagnostic classification models (DCMs) were developed to identify mastery or non-mastery of the attributes required to solve test items, but their application has been limited to very low-level attributes, and the accuracy and consistency of high-level attribute classification using DCMs have rarely been reported in comparison with classical test theory (CTT) and item response theory models. This paper compared the accuracy of high-level attribute mastery between the deterministic inputs, noisy “and” gate (DINA) model and the Rasch model, along with sub-scores based on CTT.

Methods

First, a simulation study explored the effects of attribute length (number of items per attribute) and the correlations among attributes on the accuracy of mastery classification. Second, a real-data study examined model and item fit and investigated the consistency of mastery classification for each attribute among the 3 models using the 2017 Korean Medical Licensing Examination with 360 items.

Results

The accuracy of mastery classification increased with the number of items measuring each attribute across all conditions. The DINA model was more accurate than the CTT and Rasch models for attributes with high correlations (>0.5) and few items. In the real-data analysis, the DINA and Rasch models generally showed better item fit and appropriate model fit. The consistency of mastery classification between the Rasch and DINA models ranged from 0.541 to 0.633, and the correlations of person attribute scores between the two models ranged from 0.579 to 0.786.

Conclusion

Although all 3 models provide a mastery decision for each examinee, the individual mastery profile from the DINA model yields more accurate decisions for highly correlated attributes than the CTT and Rasch models. The DINA model can also be applied directly to tests with complex structures, unlike the CTT and Rasch models, and it provides diagnostic information different from that of the CTT and Rasch models.
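For readers unfamiliar with the DINA model, the sketch below illustrates its standard item response function: an examinee answers an item correctly with probability 1 − slip when they have mastered every attribute the item's Q-matrix row requires, and with probability guess otherwise. This is a minimal sketch; the Q-matrix row and the guess/slip values are illustrative, not taken from the study.

```python
# Minimal sketch of the standard DINA item response function.
# The Q-matrix row and guess/slip values below are illustrative only.
import numpy as np

def dina_prob(alpha, q, guess, slip):
    """P(correct response) for one item under the DINA model.

    alpha : 0/1 array of an examinee's attribute masteries
    q     : 0/1 array, the item's row of the Q-matrix
    guess : P(correct) when some required attribute is not mastered
    slip  : P(incorrect) despite mastering all required attributes
    """
    eta = np.all(alpha[q == 1] == 1)  # conjunctive rule: all required attributes mastered?
    return 1 - slip if eta else guess

# Example: an item requiring attributes 1 and 3
alpha = np.array([1, 0, 1])           # examinee has mastered attributes 1 and 3
q = np.array([1, 0, 1])               # Q-matrix row for the item
print(dina_prob(alpha, q, 0.2, 0.1))  # -> 0.9
```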

Highlights

  • Editor: Sun Huh, Hallym University, Korea

  • Received: June 3, 2021; Accepted: June 22, 2021; Published: July 5, 2021

  • This article is available from: http://jeehp.org

  • The requirements for informed consent and institutional review board approval were exempted according to the Enforcement Rule of the Bioethics and Safety Act of Korea.

  • Haberman and Sinharay [9] demonstrated the appropriateness of reporting sub-scores using multidimensional item response theory (MIRT) in large-scale assessments.

  • The accuracy of mastery differed only slightly between the Rasch and DINA models across all conditions, but both models outperformed the classical test theory (CTT) model; a sketch of how such figures are computed follows below.
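A hedged sketch, not the authors' code: per-attribute accuracy is the proportion of examinees whose estimated mastery matches the true, simulated mastery, and the same agreement measure applied to two models' estimates gives their consistency.

```python
# Hedged sketch: column-wise agreement between two 0/1 mastery matrices.
# Against a simulated truth this is classification accuracy; between two
# models' estimates it is consistency. The data below are illustrative.
import numpy as np

def attribute_agreement(a, b):
    """Agreement rate per attribute for (n_examinees, n_attributes) 0/1 arrays."""
    return (a == b).mean(axis=0)

true_alpha = np.array([[1, 0], [1, 1], [0, 0], [1, 0]])  # simulated truth
est_alpha  = np.array([[1, 0], [1, 0], [0, 0], [1, 1]])  # a model's estimates
print(attribute_agreement(true_alpha, est_alpha))        # -> [1.0, 0.5]
```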

Summary

Objectives

This paper compared the accuracy and consistency of diagnostic skill reporting (students’ strengths and weaknesses in terms of mastery of content strands) between the DINA model and IRT/CTT models. To compare the sub-scores among the 3 models, a simulation study examined the effects of attribute length (number of items per attribute) and the correlations among the attributes, and a real-data study was carried out using a large-scale assessment. The simulation explored the accuracy of mastery or non-mastery classification among the 3 models, while the real-data study examined the models’ consistency in determining mastery or non-mastery of strands.
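The summary does not spell out the data-generating procedure. A common approach in DCM simulation studies, sketched below under that assumption, is to draw latent traits from a multivariate normal distribution with the target attribute correlation and dichotomize them at a threshold to obtain 0/1 mastery profiles; the function name and parameter values are hypothetical.

```python
# Hedged sketch of one common way to simulate correlated attribute profiles:
# multivariate-normal latent traits dichotomized at a threshold. This is an
# assumption about the general technique, not the authors' exact procedure.
import numpy as np

rng = np.random.default_rng(0)

def simulate_profiles(n_examinees, n_attributes, corr, threshold=0.0):
    """Draw 0/1 mastery profiles with a common pairwise latent correlation."""
    cov = np.full((n_attributes, n_attributes), corr)
    np.fill_diagonal(cov, 1.0)
    theta = rng.multivariate_normal(np.zeros(n_attributes), cov, size=n_examinees)
    return (theta > threshold).astype(int)  # 1 = mastery, 0 = non-mastery

profiles = simulate_profiles(n_examinees=1000, n_attributes=4, corr=0.5)
print(profiles.mean(axis=0))  # mastery rate per attribute (about 0.5 at threshold 0)
```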

