Abstract

Background: Standardized measures of cognitive function are often used in research to understand predictors of heterogeneity in cognitive aging. Increasingly, researchers seek to pool data across studies. When assessment protocols contain overlapping items, item response theory (IRT) and associated analytical techniques can be used to combine test items across studies into harmonized cognitive outcome(s). To better understand how characteristics of test item overlap affect a harmonized outcome, this study uses psychometric data from two cognitive aging studies to simulate and compare the accuracy of harmonization across six scenarios that vary the total number of items and the number of overlapping items.

Method: Item-level cognitive data from 19 unique items across the Health and Retirement Study (HRS; 12 items) and the Mexican Health and Aging Study (MHAS; 10 items) were used (Table 1). IRT calibration of a unidimensional model using data from the combined sample yielded item parameter and ability (theta) estimates that were then used to simulate six scenarios exploring item count and overlap combinations (19/19, 12/12, 10/10, 19/4, 19/3, and 19/1 for total/overlapping (aka "linking") items, respectively). Briefly, the simulation steps were: (1) generate 500 simulated item-level data sets with two groups (n = 500/group), each group having mean (SD) true ability set equal to the HRS and MHAS estimates, respectively; (2) apply IRT calibration to each simulated data set, generating factor scores using the expected a posteriori (EAP) method; and (3) quantify the accuracy of harmonization as the root mean square error (RMSE) of the difference between the IRT-estimated factor scores and the true thetas for each simulated data set.

Result: RMSE generally improved (error decreased) with increasing numbers of overlapping tests and linking items (Fig. 1). Differences between group means were poorly reproduced with fewer than four linking items (Fig. 2). The scenario representing the true overlap of three items between MHAS and HRS (19/3) did not reproduce means (SD) well, suggesting that, without at least one additional linking item with desirable measurement properties (scenario 5), these tests may not be harmonized accurately.

Conclusion: Future simulations should attempt to disentangle the effects of linking item quantity and characteristics (e.g., range of difficulty) from the effects of differences in group means and standard deviations on RMSE.
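
The simulation loop described under Method (generate two-group item-level data, score with EAP, compute RMSE against the true abilities) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and not taken from the study: items are treated as dichotomous two-parameter logistic (2PL) items, the item parameters and group means are made-up values rather than the HRS/MHAS estimates, the function names are hypothetical, and only 20 replicates are run. Most importantly, the sketch skips the calibration step and scores each data set with the generating item parameters, so it shows the mechanics of planned missingness, EAP scoring, and RMSE but is not expected to reproduce the linking effects reported in the abstract.

```python
import numpy as np

def simulate_responses(theta, a, b, rng):
    """Draw dichotomous 2PL responses: P(correct) = logistic(a * (theta - b))."""
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(float)

def eap_scores(resp, a, b, n_quad=61):
    """EAP ability estimates under a standard normal prior.

    NaN entries in `resp` mark items a person was not administered.
    """
    grid = np.linspace(-4.0, 4.0, n_quad)           # quadrature points
    prior = np.exp(-0.5 * grid**2)                  # unnormalized N(0, 1) prior
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))  # (Q, J)
    observed = ~np.isnan(resp)
    x = np.where(observed, resp, 0.0)
    # Log-likelihood per person and quadrature point, skipping missing items.
    loglik = (x * observed) @ np.log(p).T + ((1.0 - x) * observed) @ np.log1p(-p).T
    post = np.exp(loglik - loglik.max(axis=1, keepdims=True)) * prior
    return (post * grid).sum(axis=1) / post.sum(axis=1)

def rmse(estimate, truth):
    """Root mean square error between estimated and true abilities."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

def run_replicate(n_link, rng, n_per_group=500, n_items=19):
    """One simulated data set with `n_link` items shared by both groups."""
    a = rng.uniform(0.8, 2.0, n_items)              # hypothetical discriminations
    b = rng.normal(0.0, 1.0, n_items)               # hypothetical difficulties
    # Hypothetical group ability distributions (not the HRS/MHAS estimates).
    theta = np.concatenate([rng.normal(0.3, 1.0, n_per_group),
                            rng.normal(-0.3, 1.0, n_per_group)])
    resp = simulate_responses(theta, a, b, rng)
    # Planned missingness: the last n_link items are linking items; the
    # remaining items are split evenly between the two groups.
    n_unique = n_items - n_link
    split = n_unique // 2
    resp[:n_per_group, split:n_unique] = np.nan     # group 1 skips group 2's items
    resp[n_per_group:, :split] = np.nan             # group 2 skips group 1's items
    # Assumption: score with the generating parameters (no calibration step).
    return rmse(eap_scores(resp, a, b), theta)

rng = np.random.default_rng(0)
for n_link in (19, 4, 3, 1):
    errors = [run_replicate(n_link, rng) for _ in range(20)]
    print(n_link, round(float(np.mean(errors)), 3))
```

Replacing the scoring step with a real calibration (for example, fitting the 2PL model to each simulated data set before computing EAP scores) would be needed to study how the number of linking items affects recovery of group differences.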
