- Research Article
- 10.1177/00131644251380777
- Oct 23, 2025
- Educational and psychological measurement
- Larry V Hedges
Researchers conducting systematic reviews and meta-analyses often encounter studies in which the research design is a well conducted cluster randomized trial, but the statistical analysis does not take clustering into account. For example, the study might assign treatments by clusters but the analysis may not take into account the clustered treatment assignment. Alternatively, the analysis of the primary outcome of the study might take clustering into account, but the reviewer might be interested in another outcome for which only summary data are available in a form that does not take clustering into account. This article provides expressions for the approximate variance of risk differences, log risk ratios, and log odds ratios computed from clustered binary data, using the intraclass correlations. An example illustrates the calculations. References to empirical estimates of intraclass correlations are provided.
- Research Article
- 10.1177/00131644251371187
- Oct 23, 2025
- Educational and psychological measurement
- Ludovica De Carolis + 1 more
This study compares two network-based approaches for analyzing binary psychological assessment data: network psychometrics and latent space item response modeling (LSIRM). Network psychometrics, a well-established method, infers relationships among items or symptoms based on pairwise conditional dependencies. In contrast, LSIRM is a more recent framework that represents item responses as a bipartite network of respondents and items embedded in a latent metric space, where the likelihood of a response decreases with increasing distance between the respondent and item. We evaluate the performance of both methods through simulation studies under varying data-generating conditions. In addition, we demonstrate their applications to real assessment data, showcasing the distinct insights each method offers to researchers and practitioners.
- Research Article
- 10.1177/00131644251374302
- Oct 7, 2025
- Educational and psychological measurement
- Georgios D Sideridis + 1 more
The three-parameter logistic (3PL) model in item-response theory (IRT) has long been used to account for guessing in multiple-choice assessments through a fixed item-level parameter. However, this approach treats guessing as a property of the test item rather than the individual, potentially misrepresenting the cognitive processes underlying the examinee's behavior. This study evaluates a novel alternative, the Two-Parameter Logistic Extension (2PLE) model, which re-conceptualizes guessing as a function of a person's ability rather than as an item-specific constant. Using Monte Carlo simulation and empirical data from the PIRLS 2021 reading comprehension assessment, we compared the 3PL and 2PLE models on the recovery of latent ability, predictive fit (Leave-One-Out Information Criterion [LOOIC]), and theoretical alignment with test-taking behavior. The simulation results demonstrated that although both models performed similarly in terms of root-mean-squared error (RMSE) for ability estimates, the 2PLE model consistently achieved superior LOOIC values across conditions, particularly with longer tests and larger sample sizes. In an empirical analysis involving the reading achievement of 131 fourth-grade students from Saudi Arabia, model comparison again favored 2PLE, with a statistically significant LOOIC difference (ΔLOOIC = 0.482, z = 2.54). Importantly, person-level guessing estimates derived from the 2PLE model were significantly associated with established person-fit statistics (C*, U3), supporting their criterion validity. These findings suggest that the 2PLE model provides a more cognitively plausible and statistically robust representation of examinee behavior by embedding an ability-dependent guessing function.
- Research Article
1
- 10.1177/00131644251369532
- Oct 4, 2025
- Educational and psychological measurement
- Yun-Kyung Kim + 2 more
In item response theory modeling, item fit analysis using posterior expectations, otherwise known as pseudocounts, has many advantages. They are readily obtained from the E-step output of the Bock-Aitkin Expectation-Maximization (EM) algorithm and continue to function as a basis of evaluating model fit, even when missing data are present. This paper aimed to improve the interpretability of the root mean squared deviation (RMSD) index based on posterior expectations. In Study 1, we assessed its performance using two approaches. First, we employed the poor person's posterior predictive model checking (PP-PPMC) to compute their significance levels. The resulting Type I error was generally controlled below the nominal level, but power noticeably declined with smaller sample sizes and shorter test lengths. Second, we used receiver operating characteristic (ROC) curve analysis (±) to empirically determine the reference values (cutoff thresholds) that achieve an optimal balance between false-positive and true-positive rates. Importantly, we identified optimal reference values for each combination of sample size and test length in the simulation conditions. The cutoff threshold approach outperformed the PP-PPMC approach with greater gains in true-positive rates than losses from the inflated false-positive rates. In Study 2, we extended the cutoff threshold approach to conditions with larger sample sizes and longer test lengths. Moreover, we evaluated the performance of the optimized cutoff thresholds under varying levels of data missingness. Finally, we employed response surface analysis (±) to develop a prediction model that generalizes the way the reference values vary with sample size and test length. Overall, this study demonstrates the application of the PP-PPMC for item fit diagnostics and implements a practical frequentist approach to empirically derive reference values. Using our prediction model, practitioners can compute the reference values of RMSD that are tailored to their dataset's sample size and test length.
- Book Chapter
- 10.4324/9781003439769-18
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more
- Book Chapter
- 10.4324/9781003439769-10
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more
- Book Chapter
- 10.4324/9781003439769-12
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more
- Book Chapter
- 10.4324/9781003439769-5
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more
- Book Chapter
- 10.4324/9781003439769-14
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more
- Book Chapter
- 10.4324/9781003439769-1
- Oct 2, 2025
- Educational and Psychological Measurement
- W Holmes Finch + 3 more