A Robust Indicator Mean-Based Method for Estimating Generalizability Theory Absolute Error and Related Dependability Indices within Structural Equation Modeling Frameworks
In this study, we introduce a novel and robust approach for computing Generalizability Theory (GT) absolute error and related dependability indices using indicator intercepts that represent observed means within structural equation models (SEMs). We demonstrate the applicability of our method using one-, two-, and three-facet designs with self-report measures having varying numbers of scale points. Results for the indicator mean-based method align well with those obtained from conventional GT analyses in GENOVA and the R gtheory package, and improve upon previously suggested methods for deriving absolute error and corresponding dependability indices from SEMs when analyzing three-facet designs. We further extend our approach to derive Monte Carlo confidence intervals for all key indices and to incorporate estimation procedures that correct for scale coarseness effects commonly observed when analyzing binary or ordinal data.
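As background to the absolute error and dependability indices this abstract refers to, the standard one-facet (person x item) G-theory formulas can be sketched as follows. This is an illustrative sketch with hypothetical variance components, not the authors' indicator mean-based SEM method: absolute error includes the item main effect, so the dependability coefficient (Phi) is never larger than the generalizability coefficient.

```python
# Hedged sketch: one-facet person x item (p x i) G-theory indices computed
# from illustrative variance components (all values are hypothetical).
var_p, var_i, var_pi = 4.0, 1.0, 2.0  # person, item, person-x-item (incl. residual)
n_i = 10                              # number of items averaged over

rel_error = var_pi / n_i                 # sigma^2(delta): relative error
abs_error = var_i / n_i + var_pi / n_i   # sigma^2(Delta): item main effect counts too
g_coef = var_p / (var_p + rel_error)     # generalizability (relative) coefficient
phi = var_p / (var_p + abs_error)        # dependability (absolute) coefficient

print(round(g_coef, 3), round(phi, 3))
```

Because `abs_error` adds the item variance term, Phi here (0.930) is slightly below the generalizability coefficient (0.952).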
- Research Article
34
- 10.1007/s10519-008-9247-7
- Dec 14, 2008
- Behavior Genetics
Following the publication of Purcell's approach to modeling gene-by-environment interaction in 2002, interest in G x E modeling in twin and family data increased dramatically. The analytic techniques described by Purcell were designed for use with continuous data. Here we explore the re-parameterization of these models for use with ordinal and binary outcome data. Analysis of binary and ordinal data within the context of a liability threshold model traditionally requires constraining the total variance to unity to ensure identification. Here, we demonstrate an alternative approach for use with ordinal data, in which the values of the first two thresholds are fixed, thus allowing the total variance to change as a function of the moderator. We also demonstrate that when using binary data, constraining the total variance to unity for a given value of the moderator is sufficient to ensure identification. Simulation results indicate that analyses of ordinal and binary data can recover both the raw and standardized patterns of results. However, the scale of the results depends on the specification of the (threshold or variance) constraints rather than on the underlying distribution of liability. Example Mx scripts are provided.
- Front Matter
24
- 10.1097/00000542-200203000-00002
- Mar 1, 2002
- Anesthesiology
Measurement of pain in children: state-of-the-art considerations.
- Research Article
75
- 10.1027/2698-1866/a000034
- Feb 1, 2023
- Psychological Test Adaptation and Development
The importance of providing structural validity evidence for test scores derived from psychometric test instruments is highlighted by several institutions; for example, the American Psychological Association (2014) demands that evidence for the validity of an instrument's internal structure and its underlying measurement model be provided before it is applied in psychological assessment. Knowledge about the latent structure of test data addresses the major question of what construct(s) are being measured by the psychological tests under investigation (Ziegler, 2014, 2020). The study of structural validity is typically addressed with factor analyses when the test scores reflect continuous latent traits. As most submissions to Psychological Test Adaptation and Development (PTAD) deal with the adaptation and further development of existing measures, authors typically test a measurement model that is based on theoretical considerations and prior findings on original versions (or adaptations) of the test under investigation. Our literature review of PTAD's publications showed that more than 90% of the articles contain at least one confirmatory factor analysis (CFA). As editors and reviewers of PTAD, we appreciate that authors are rigorous in providing evidence on the structural validity of their tests' data. However, since PTAD's inception in 2019, we have observed that one comment is frequently communicated to authors during the review process, namely, the request to change the analytic approach in CFA from maximum likelihood (ML) estimation to the mean- and variance-adjusted weighted least squares (WLSMV; Muthén et al., 1997) estimator, to account for the ordinal nature of the data that psychological instruments typically generate at the item level.
In this editorial, we discuss the rationale behind choosing the WLSMV estimator when analyzing test adaptations and developments that are based on ordinal categorical data and concisely illustrate the problems associated with using the ML estimator (potentially in combination with robust tests of model fit) for such data.
- Research Article
- 10.12973/ijem.11.3.423
- Aug 9, 2025
- International Journal of Educational Methodology
Educational researchers, as well as researchers in other disciplines, often work with ordinal data, such as Likert item responses and test item scores. Critical questions arise when researchers attempt to implement statistical models to analyse ordinal data, given that many statistical techniques assume the data analysed to be continuous. Could ordinal data be treated as continuous data, that is, assuming the ordinal data to be continuous and then applying statistical techniques as if analysing continuous data? Why and why not? Focusing on structural equation models (SEMs), particularly confirmatory factor analysis (CFA), this article discusses an ongoing debate on the treatment of ordinal data and reports a short review on the practices of conducting and reporting SEMs, in the context of mathematics education research. The author reviewed 70 publications in mathematics education research that reported a study involving SEMs to analyse ordinal data; fewer than half discussed how the data were treated or guided readers through the analysis, making such analyses harder to reproduce and their results harder to evaluate. This article invites methodological discussions on SEMs with ordinal variables in the practices of educational research. Subsequently, a standard for reporting SEMs with ordinal data is proposed, followed by an example. This standard contributes to educational research by enabling researchers (self and others) to evaluate SEMs reported. The example demonstrates, using real-life research data, how two different approaches for analysing ordinal data (as continuous or as a product of discretisation from some continuous distributions) can lead to results that disagree.
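One mechanism behind the disagreement the abstract describes can be shown in a few lines. This is a hedged simulation sketch (not from the article): discretising a continuous variable into a few ordered categories attenuates its Pearson correlation with another variable, which is one reason "ordinal as continuous" analyses and threshold-based analyses can diverge.

```python
import numpy as np

# Illustrative simulation: coarsening a continuous variable into a 4-point
# "Likert" item shrinks its observed Pearson correlation with a covariate.
rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal(n)
y = 0.6 * x + np.sqrt(1 - 0.6**2) * rng.standard_normal(n)  # latent corr = 0.6

# Discretise y via arbitrary cut points (assumed thresholds, for illustration).
y_ord = np.digitize(y, [-1.0, 0.0, 1.0])

r_latent = np.corrcoef(x, y)[0, 1]
r_coarse = np.corrcoef(x, y_ord)[0, 1]
print(r_latent > r_coarse)  # coarsening attenuates the observed correlation
```

Threshold-based estimators (e.g., polychoric correlations) aim to recover the latent 0.6 rather than the attenuated observed value.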
- Research Article
9
- 10.1111/j.2042-3306.2011.00414.x
- Jun 2, 2011
- Equine Veterinary Journal
Clinical studies utilising ordinal data: Pitfalls in the analysis and interpretation of clinical grading systems
- Research Article
4
- 10.1080/10705511.2016.1227260
- Sep 16, 2016
- Structural Equation Modeling: A Multidisciplinary Journal
This study evaluates latent differential equation models on binary and ordinal data. Binary and ordinal data are widely used in psychology research and many statistical models have been developed, such as the probit model and the logit model. We combine the latent differential equation model with the probit model through a threshold approach, and then compare the threshold model with a naive model, which blindly treats binary and ordinal data as continuous. Simulation results suggest that the naive model leads to bias on binary data and on ordinal data with fewer than 5 levels, whereas the threshold model is unbiased and efficient for binary and ordinal data. Two example analyses on empirical binary data and ordinal data show that the threshold model also has better external validity. The R code for the threshold model is provided.
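The threshold approach this abstract combines with the probit model rests on a simple identity: a binary item equals 1 when a standard-normal latent response exceeds a threshold tau, so tau is recoverable from the endorsement rate via the inverse normal CDF. The sketch below illustrates that identity with simulated data; it is not the authors' latent differential equation code.

```python
from statistics import NormalDist
import numpy as np

# Hedged sketch of the threshold idea behind probit-type models:
# observe only 0/1 responses, yet recover the latent threshold tau.
rng = np.random.default_rng(1)
tau = 0.5
latent = rng.standard_normal(200_000)   # standard-normal latent responses
binary = (latent > tau).astype(int)     # observed binary item

p_hit = binary.mean()                       # observed proportion of 1s
tau_hat = NormalDist().inv_cdf(1 - p_hit)   # threshold estimate
print(abs(tau_hat - tau) < 0.02)
```

Treating the 0/1 responses themselves as continuous discards this mapping, which is the source of the bias the simulation study reports for the naive model.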
- Research Article
13
- 10.1002/(sici)1098-2272(1996)13:1<79::aid-gepi7>3.0.co;2-1
- Jan 1, 1996
- Genetic Epidemiology
The univariate analysis of categorical twin data can be performed using either structural equation modeling (SEM) or logistic regression. This paper presents a comparison between these two methods using a simulation study. Dichotomous and ordinal (three category) twin data are simulated under two different sample sizes (1,000 and 2,000 twin pairs) and according to different additive genetic and common environmental models of phenotypic variation. The two methods are found to be generally comparable in their ability to detect a "correct" model under the specifications of the simulation. Both methods lack power to detect the right model for dichotomous data when the additive genetic effect is low (between 10 and 20%) or medium (between 30 and 40%); the ordinal data simulations produce similar results except for the additive genetic model with medium or high heritability. Neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large and the sample size included 2,000 twin pairs. The SEM method was found to have better power than logistic regression when there is a medium (30%) or high (50%) additive genetic effect and a modest common environmental effect. Conversely, logistic regression performed better than SEM in correctly detecting additive genetic effects with simulated ordinal data (for both 1,000 and 2,000 pairs) that did not contain modest common environmental effects; in this case the SEM method incorrectly detected a common environmental effect that was not present.
- Research Article
8
- 10.1080/10705511.2023.2222913
- Aug 9, 2023
- Structural Equation Modeling: A Multidisciplinary Journal
We demonstrate how to analyze complete multivariate generalizability theory (GT) designs within structural equation modeling frameworks that encompass both individual subscale scores and composites formed from those scores. Results from numerous analyses of observed scores obtained from respondents who completed the recently updated form of the Big Five Inventory (BFI-2) revealed that the lavaan SEM package in R produced results virtually identical to those obtained from the mGENOVA package, which historically has served as the gold standard for conducting multivariate GT analyses. We further extended lavaan analyses beyond what mGENOVA allows to produce Monte Carlo based confidence intervals for key GT parameters and correct score consistency and correlational indices for effects of scale coarseness characteristic of binary and ordinal data. Our comprehensive online Supplemental Material includes code for performing all illustrated analyses using lavaan and mGENOVA.
- Research Article
25
- 10.1002/(sici)1097-0258(19990228)18:4<385::aid-sim25>3.0.co;2-1
- Feb 28, 1999
- Statistics in Medicine
We explore structural equations with latent variables for modelling between-individual variability and measurement error in the analysis of longitudinal binary and ordinal data. The structural equation formulation provides insight into the assumptions and differences in interpretation of methods that are popular for longitudinal data analysis. Introducing the concept of continuous latent variables makes it clear that marginal and cluster-specific models differ because their predicted variables are scaled to different standard deviations, and that adjustment for measurement error in the outcome involves a change in scale as well. We apply both structural equation modelling and common longitudinal modelling approaches to data from a study of sleep disorders. In the process, we compare results from marginal modelling using an SAS GEE routine (Karim and Zeger, 1988), Qu's GAUSS program (Qu, 1992) for generalized mixed models using GEE, the MIXOR package for cluster-specific mixed effects models (Hedeker and Gibbons, 1994), and LISCOMP for structural models (Muthén, 1988).
- Research Article
86
- 10.1348/000711010x497442
- Jan 25, 2011
- British Journal of Mathematical and Statistical Psychology
This paper develops a ridge procedure for structural equation modelling (SEM) with ordinal and continuous data by modelling the polychoric/polyserial/product-moment correlation matrix R. Rather than directly fitting R, the procedure fits a structural model to R(a) =R+aI by minimizing the normal distribution-based discrepancy function, where a > 0. Statistical properties of the parameter estimates are obtained. Four statistics for overall model evaluation are proposed. Empirical results indicate that the ridge procedure for SEM with ordinal data has better convergence rate, smaller bias, smaller mean square error, and better overall model evaluation than the widely used maximum likelihood procedure.
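The core of the ridge procedure, R(a) = R + aI, can be illustrated directly: adding aI to a symmetric matrix shifts every eigenvalue up by a, so a polychoric/polyserial correlation matrix that is indefinite (as can happen with pairwise estimation) becomes positive definite before model fitting. The matrix and the value of a below are toy choices, not the paper's data.

```python
import numpy as np

# Hedged sketch of R(a) = R + a*I on an indefinite "pseudo-correlation" matrix.
R = np.array([[1.00, 0.95, 0.10],
              [0.95, 1.00, 0.90],
              [0.10, 0.90, 1.00]])
a = 0.3
R_a = R + a * np.eye(3)   # every eigenvalue increases by exactly a

min_before = np.linalg.eigvalsh(R)[0]
min_after = np.linalg.eigvalsh(R_a)[0]
print(min_before < 0, min_after > 0)
```

In the paper's procedure the structural model is then fit to R(a) with the normal-theory discrepancy function; the sketch only demonstrates why the shift restores positive definiteness.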
- Research Article
11
- 10.3390/psych5020019
- Apr 13, 2023
- Psych
Generalizability theory provides a comprehensive framework for determining how multiple sources of measurement error affect scores from psychological assessments and using that information to improve those assessments. Although generalizability theory designs have traditionally been analyzed using analyses of variance (ANOVA) procedures, the same analyses can be replicated and extended using structural equation models. We collected multi-occasion data from inventories measuring numerous dimensions of personality, self-concept, and socially desirable responding to compare variance components, generalizability coefficients, dependability coefficients, and proportions of universe score and measurement error variance using structural equation modeling versus ANOVA techniques. We further applied structural equation modeling techniques to continuous latent response variable metrics and derived Monte Carlo-based confidence intervals for those indices on both observed score and continuous latent response variable metrics. Results for observed scores estimated using structural equation modeling and ANOVA procedures seldom varied. Differences in reliability between raw score and continuous latent response variable metrics were much greater for scales with dichotomous responses, thereby highlighting the value of doing analyses on both metrics to evaluate gains that might be achieved by increasing response options. We provide detailed guidelines for applying the demonstrated techniques using structural equation modeling and ANOVA-based statistical software.
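The Monte Carlo confidence intervals mentioned here can be sketched generically: draw variance components from assumed sampling distributions around their point estimates, recompute the generalizability coefficient for each draw, and take percentile bounds. All numbers below are hypothetical, and the normal-sampling assumption is a simplification rather than the authors' SEM-based procedure.

```python
import numpy as np

# Hedged Monte Carlo interval for a generalizability coefficient,
# propagating uncertainty in two variance components (values hypothetical).
rng = np.random.default_rng(42)
n_i = 8                          # items averaged over
var_p, var_err = 3.0, 2.4        # point estimates: person, error variance
se_p, se_err = 0.4, 0.3          # assumed standard errors

draws_p = np.clip(rng.normal(var_p, se_p, 10_000), 1e-6, None)
draws_e = np.clip(rng.normal(var_err, se_err, 10_000), 1e-6, None)
g_draws = draws_p / (draws_p + draws_e / n_i)

lo, hi = np.percentile(g_draws, [2.5, 97.5])
point = var_p / (var_p + var_err / n_i)
print(lo < point < hi)
```

The clipping guards against negative variance draws; percentile bounds keep the interval inside (0, 1), which symmetric normal-theory intervals for reliability coefficients do not guarantee.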
- Research Article
22
- 10.1037/met0000177
- Apr 1, 2019
- Psychological Methods
In this article, we illustrate ways in which generalizability theory (G-theory) can be used with continuous latent response variables (CLRVs) to address problems of scale coarseness resulting from categorization errors caused by representing ranges of continuous variables by discrete data points and transformation errors caused by unequal interval widths between those data points. The mechanism to address these problems is applying structural equation modeling (SEM) as a tool in deriving variance components needed to estimate indices of score consistency and validity. Illustrations include quantification of multiple sources of measurement error, use of non-nested and nested designs, derivation of indices of consistency for norm- and criterion-referenced interpretation of scores, estimation of effects when changing measurement procedures and designs, and disattenuation of correlation coefficients for measurement error. These illustrations underscore the effectiveness of G-theory with continuous latent response variables in providing stable indices of reliability and validity that are reasonably independent of the number of original scale points used, unevenness of scale intervals, and average degree of item skewness. We discuss general distinctions in reliability estimation within G-theory, SEM, and classical test theory; make specific recommendations for using G-theory on raw score and CLRV metrics; and provide computer code in an online supplement for doing all key analyses demonstrated in the article using R and Mplus.
- Dissertation
- 10.6342/ntu201802263
- Jan 1, 2018
Network meta-analysis is a crucial tool to combine direct and indirect evidence to compare the efficacy and harm of several treatments. Structural equation modeling is a statistical method that investigates relations among observed and latent variables. Previous studies have shown that the contrast-based Lu-Ades model for network meta-analysis can be implemented in the structural equation modeling framework. However, the Lu-Ades model uses the difference between treatments as the unit of analysis, thereby introducing correlations between observations and rendering data entry and model building a complex task. In this PhD thesis, we first demonstrated how to undertake network meta-analysis in structural equation modeling using the outcome of treatment arms as the unit of analysis (the arm-parameterized model). Because the arm-parameterized model may introduce many fixed-effect parameters, it can lead to the incidental parameter problem, which we resolved using parameter elimination via integration. We also showed that our models can include trials with within-person designs without the need for complex data manipulation. In addition, a novel approach to meta-analysis, the unrestricted weighted least squares, can be readily extended to network meta-analysis under our statistical framework. Based on the structural equation model we developed, we used multiple group analysis to evaluate inconsistency within network meta-analysis. We assessed two types of inconsistency: global inconsistency and direct-indirect evidence inconsistency. Global inconsistency was evaluated using the design inconsistency model, which can be constructed elegantly in structural equation modeling. Direct-indirect evidence inconsistency was further categorized into two approaches: design-oriented and information-oriented. For design-oriented direct-indirect evidence inconsistency, we proposed two graphs, the evidence network graph and the evidence comparison graph, to explore its structure.
Based on these graphs, we proposed an arm-parameterized inconsistency model that unifies current approaches to inconsistency evaluation. For information-oriented direct-indirect evidence inconsistency, we proposed a series of evidence splitting models and showed that our model was the only one to truly reflect the difference between the direct information for the contrast of interest and the rest of the evidence network. These evidence splitting models require flexible parameterizations and are estimable only in structural equation modeling. Finally, we set out to evaluate the performance of various estimation approaches for network meta-analysis under the sparse data scenario, i.e., when the event probability was low for binary data. Simulations revealed that the conventional empirical odds ratio, which is commonly used in frequentist packages for network meta-analysis, suffered from severe bias toward the null under low event probability. Of all the current estimation methods, the hierarchical generalized linear model was the most robust approach in terms of bias, control of the type I error rate, and power. For our structural equation modeling framework, the multivariate Peto odds ratio, an extension of the Peto odds ratio in conventional meta-analysis, performed well when the effect size was not too large. We also proposed a centered multivariate Peto odds ratio whose estimates do not depend on the relative sample size of each treatment and which can partly alleviate the bias under unequal patient allocation across treatment arms. In conclusion, the arm-parameterized model in structural equation modeling provides an extremely flexible framework for network meta-analysis, which can adapt to various model assumptions and assess their adequacy, such as inconsistency. Therefore, structural equation modeling has the potential to become a standard tool for network meta-analysis.
- Research Article
- 10.1007/s11336-021-09837-3
- Jan 29, 2022
- Psychometrika
In practice, it is common that a best fitting structural equation model (SEM) is selected from a set of candidate SEMs and inference is conducted conditional on the selected model. Such post-selection inference ignores the model selection uncertainty and yields overly optimistic inference. Using the largest candidate model avoids model selection uncertainty but introduces large variation. Jin and Ankargren (Psychometrika 84:84–104, 2019) proposed to use frequentist model averaging in SEM with continuous data as a compromise between model selection and the full model. They assumed that the true values of the parameters depend on n^{-1/2}, with n being the sample size, which is known as a local asymptotic framework. This paper shows that their results are not directly applicable to SEM with ordinal data. To address this issue, we prove consistency and asymptotic normality of the polychoric correlation estimators under the local asymptotic framework. Then, we propose a new frequentist model averaging estimator and a valid confidence interval that are suitable for ordinal data. Goodness-of-fit test statistics for the model averaging estimator are also derived.
- Research Article
15
- 10.1186/s12874-018-0514-x
- Jun 19, 2018
- BMC Medical Research Methodology
Background: Realist approaches seek to answer questions such as ‘how?’, ‘why?’, ‘for whom?’, ‘in what circumstances?’ and ‘to what extent?’ interventions ‘work’, using context-mechanism-outcome (CMO) configurations. Quantitative methods are not well established in realist approaches, but structural equation modelling (SEM) may be useful to explore CMO configurations. Our aim was to assess the feasibility and appropriateness of SEM to explore CMO configurations and, if appropriate, make recommendations based on our research on access to primary care. Our specific objectives were to map variables from two large population datasets to CMO configurations from our realist review of access to primary care, generate latent variables where needed, and use SEM to quantitatively test the CMO configurations.
Methods: A linked dataset was created by merging individual patient data from the English Longitudinal Study of Ageing and practice data from the GP Patient Survey. Patients registered in rural practices and in the highest deprivation tertile were included. Three latent variables were defined using confirmatory factor analysis. SEM was used to explore the nine full CMOs. All models were estimated using robust maximum likelihood and accounted for clustering at the practice level. Ordinal variables were treated as continuous to ensure convergence.
Results: We successfully explored our CMO configurations, but the analysis was limited by data availability. Two hundred seventy-six participants were included. We found a statistically significant direct (context to outcome) or indirect effect (context to outcome via mechanism) for two of nine CMOs. The strongest association was between ‘ease of getting through to the surgery’ and ‘being able to get an appointment’, with an indirect effect mediated through convenience (the indirect effect was 21% of the total effect). Healthcare experience was not directly associated with getting an appointment, but there was a statistically significant indirect effect through convenience (53% mediated effect). Model fit indices showed adequate fit.
Conclusions: SEM allowed quantification of CMO configurations and could complement other qualitative and quantitative techniques in realist evaluations to support inferences about strengths of relationships. Future research exploring CMO configurations with SEM should aim to collect primary data, preferably continuous.