Abstract
This study investigates person parameter recovery when polytomous-attribute cognitive diagnosis models and a multidimensional item response theory (MIRT) model are retrofitted to data generated under a different model. The data are generated using two cognitive diagnosis models (i.e., the pG-DINA, the polytomous generalized deterministic inputs, noisy “and” gate model, and the fA-M, the fully-additive model) and one MIRT model (i.e., the compensatory two-parameter logistic model). Twenty-five replications are used for each of the 54 conditions that result from varying the item discrimination index, the ratio of simple to complex items, the test length, and the correlations between skills. Accuracy is evaluated by comparing the person parameter estimates from all three models with the true parameters used in data generation. The most accurate estimates are obtained when the fitted model matches the generating model. Comparable results are obtained when the fA-M is retrofitted to data from the other models or when the MIRT model is retrofitted to fA-M data. However, the results are poor when the pG-DINA is retrofitted to data from the other models or when the MIRT model is retrofitted to pG-DINA data. Among the conditions studied, test length and item discrimination have the greatest influence on person parameter estimation accuracy. Variation in the ratio of simple to complex items has a notable influence when the MIRT model is used. The correlation between skills has only a limited impact on person parameter estimation accuracy, although its effect is more pronounced for MIRT-generated data.
Highlights
Some of the specific measurement procedures used in education and psychology can be applied to one or more attributes
We addressed this through the following sub-problems: (1) What levels of accuracy can be obtained for person parameter classification and ability level estimation from the two cognitive diagnosis models (CDMs) and the one multidimensional item response theory (MIRT) model when they are fitted to MIRT data generated under various item discrimination, item structure, skill correlation, and test length conditions? Is there a difference between the person parameter estimation accuracy levels of the models?
These findings suggest that the correct vector classification rates (CVCRs) from the MIRT analyses and from the fully-additive model (fA-M) retrofitting were comparable, and that both differed from the CVCRs of the pG-DINA analyses
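As a rough illustration of the accuracy measure named above, the sketch below computes a correct vector classification rate. It assumes the usual reading of CVCR: the proportion of examinees whose entire estimated attribute vector equals the generating vector. The function name and the toy profiles are hypothetical and are not taken from the study.

```python
def correct_vector_classification_rate(true_profiles, estimated_profiles):
    """Proportion of examinees whose whole attribute vector is recovered.

    Assumption: CVCR counts an examinee as correct only if every
    attribute level in the estimated vector matches the true vector.
    """
    if len(true_profiles) != len(estimated_profiles):
        raise ValueError("Both inputs must cover the same examinees.")
    if not true_profiles:
        raise ValueError("At least one examinee is required.")
    matches = sum(
        1
        for true_vec, est_vec in zip(true_profiles, estimated_profiles)
        if list(true_vec) == list(est_vec)
    )
    return matches / len(true_profiles)


# Hypothetical polytomous attribute profiles (levels 0, 1, 2) for four examinees.
true_profiles = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
estimated_profiles = [[0, 1, 2], [1, 0, 0], [2, 0, 1], [0, 0, 0]]

# The second examinee has one misclassified attribute, so 3 of 4 vectors match.
cvcr = correct_vector_classification_rate(true_profiles, estimated_profiles)
print(cvcr)
```

Because a single misclassified attribute makes the whole vector count as wrong, this rate is stricter than an attribute-by-attribute agreement rate.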
Summary
Some of the specific measurement procedures used in education and psychology can be applied to one or more attributes. Scales constructed to measure a single skill may be applied to another, but high correlations between the skills measured may render the scale insensitive to measuring other skills (Reckase, 2007). Tests may appear to measure only one main skill. In psychological measurement, however, if the correlations between the measured skills are not too high, the main factor may not suppress the other factors. Multiple skills may be measured intentionally or unintentionally. Various psychometric approaches can be taken when measuring multiple skills. In item response theory (IRT), unidimensional IRT (UIRT) models can be applied multiple times to measure one skill at a time, whereas multidimensional IRT (MIRT) models can be used to measure more than one skill simultaneously.
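To make the idea of measuring several skills at once concrete, the following sketch evaluates the compensatory two-parameter logistic MIRT model in its standard slope-intercept form, P = 1 / (1 + exp(-(a·θ + d))). The function name and all parameter values are hypothetical illustrations, not the values used in the study.

```python
import math


def compensatory_2pl_probability(theta, a, d):
    """Probability of a correct response under a compensatory 2PL MIRT item.

    theta: ability values, one per skill
    a:     discrimination values, one per skill (0 means the item
           does not measure that skill)
    d:     item intercept
    """
    if len(theta) != len(a):
        raise ValueError("theta and a must have the same number of skills.")
    logit = sum(a_k * theta_k for a_k, theta_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical complex item that loads equally on two skills.
complex_item = {"a": [1.0, 1.0], "d": 0.0}

# Compensation: a high level on skill 1 offsets a low level on skill 2,
# so this examinee has the same success probability as an average one.
p_mixed = compensatory_2pl_probability([1.0, -1.0], **complex_item)
p_average = compensatory_2pl_probability([0.0, 0.0], **complex_item)

# Hypothetical simple item that measures only the first skill.
simple_item = {"a": [1.2, 0.0], "d": 0.0}
p_simple = compensatory_2pl_probability([1.0, -1.0], **simple_item)

print(p_mixed, p_average, p_simple)
```

In this form, a simple item has a single nonzero discrimination, whereas a complex item has several, which is one way to picture the simple-to-complex item ratio varied in the simulation.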
More From: Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi