Evaluating Fit Indices for Multivariate t-Based Structural Equation Modeling with Data Contamination.

Mark H. C. Lai, Jiaqi Zhang

doi:10.3389/fpsyg.2017.01286

Open Access

Journal: Frontiers in psychology	Publication Date: Jul 28, 2017
Citations: 8	License type: cc-by

Affiliation: University of Cincinnati

Abstract

In conventional structural equation modeling (SEM), in the presence of even a tiny amount of data contamination due to outliers or influential observations, normal-theory maximum likelihood (ML-Normal) is not efficient and can be severely biased. Multivariate-t-based SEM, which was recently implemented in Mplus as an approach for mixture modeling, represents a robust estimation alternative that downweights the impact of outliers and influential observations. To our knowledge, the use of maximum likelihood estimation with a multivariate-t model (ML-t) to handle outliers has not been shown in the SEM literature. In this paper we demonstrate the use of ML-t using the classic Holzinger and Swineford (1939) data set with a few observations modified to be outliers or influential observations. A simulation study is then conducted to examine the performance of fit indices and information criteria under ML-Normal and ML-t in the presence of outliers. Results showed that whereas all fit indices worsened under ML-Normal as the number of outliers and influential observations increased, their values were relatively stable under ML-t. Information criteria were effective in selecting ML-Normal without data contamination and ML-t with data contamination, especially when the sample size was at least 200.
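The comparison by information criteria described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the log-likelihoods, parameter counts, and sample size are hypothetical placeholders, and the only structural assumption is that ML-t estimates one more parameter (the degrees of freedom) than ML-Normal for the same model.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 logL + 2k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2 logL + k ln(n)."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical values for illustration only (not taken from the paper).
n = 200
loglik_normal, k_normal = -2450.0, 30   # ML-Normal fit
loglik_t, k_t = -2430.0, 31             # ML-t fit: one extra parameter (df)

bic_normal = bic(loglik_normal, k_normal, n)
bic_t = bic(loglik_t, k_t, n)

# The estimator with the smaller criterion value is preferred.
preferred = "ML-t" if bic_t < bic_normal else "ML-Normal"
print(preferred)
```

With these placeholder numbers the gain in log-likelihood outweighs the penalty for the extra parameter, so ML-t is chosen. With uncontaminated data the gain would typically be too small to offset that penalty.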

Highlights

Although previous studies have shown that a robust covariance matrix based on weights corresponding to a multivariate t distribution provides good parameter estimates and likelihood ratio test (LRT) statistics similar to those obtained without outliers under ML-Normal (Yuan and Bentler, 1998b), to our knowledge no analytic or simulation studies have evaluated the performance of the LRT and fit indices obtained under multivariate-t-based structural equation modeling (SEM) as implemented in Mplus (i.e., ML-t), with df estimated rather than specified by users.
As pointed out in Yuan and Zhong (2013), diagnostic tools for outliers and influential observations are common in general statistics software but rarely accessible in SEM software, partly because of the complexity of SEM.
Robust SEM using Huber-type weights has been developed and shown to perform well, and the rsem package is freely available in R. However, many researchers are more familiar with other commonly used SEM software packages such as Mplus, so it is important to have comparable tools for handling outliers and influential observations in other software.
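To make the idea of weights corresponding to a multivariate t distribution concrete, here is a minimal sketch. It uses the standard EM weight for a multivariate t with p variables and degrees of freedom ν, w_i = (ν + p) / (ν + d_i²), where d_i² is the squared Mahalanobis distance of case i. Several things here are assumptions for illustration: the function name `t_weighted_moments` is hypothetical, ν is held fixed (whereas ML-t in Mplus estimates it), and the code estimates an unstructured mean vector and scale matrix rather than fitting an SEM.

```python
import numpy as np

def t_weighted_moments(Y, df=5.0, n_iter=100, tol=1e-8):
    """Iteratively reweighted mean and scale matrix under a multivariate t
    with fixed degrees of freedom (an assumption of this sketch)."""
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    mu = Y.mean(axis=0)
    S = np.cov(Y, rowvar=False)
    w = np.ones(n)
    for _ in range(n_iter):
        diff = Y - mu
        # Squared Mahalanobis distances under the current estimates.
        d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(S, diff.T).T)
        # Cases far from the center receive small weights.
        w = (df + p) / (df + d2)
        mu_new = (w[:, None] * Y).sum(axis=0) / w.sum()
        diff = Y - mu_new
        S_new = (w[:, None] * diff).T @ diff / n
        converged = (np.max(np.abs(mu_new - mu)) < tol
                     and np.max(np.abs(S_new - S)) < tol)
        mu, S = mu_new, S_new
        if converged:
            break
    return mu, S, w

# Simulated data: 300 cases on 3 variables, the first 10 shifted far away.
rng = np.random.default_rng(1)
Y = rng.normal(size=(300, 3))
Y[:10] += 10.0
mu_t, S_t, w = t_weighted_moments(Y)
```

In this simulated example the shifted cases end up with much smaller weights than the rest, so the weighted mean stays closer to the center of the uncontaminated cases than the ordinary sample mean does. Note that `S_t` is the scale matrix of the t distribution, not its covariance matrix.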

Summary

OUTLIERS AND INFLUENTIAL OBSERVATIONS

Whereas topics related to outliers, or more generally data contamination, are commonly discussed in quantitative research methodology textbooks, in practice researchers do not always agree on how to define them or how best to handle them. In regression with only one response variable, outliers are cases that deviate greatly from their predicted values based on the regression line. In multivariate analyses such as SEM, the distance of an observation from the center of most of the data points is commonly quantified by the Mahalanobis distance (d), where $d_i^2 = (y_i - \mu)^\top \Sigma^{-1} (y_i - \mu)$, with $\mu$ and $\Sigma$ denoting the mean vector and covariance matrix. As discussed in Yuan and Zhong (2008), outliers in SEM have large values of e and will inflate the covariance matrix of the outcome variables. They may or may not have large values in η. Influential observations can be good or bad: good influential observations have extreme ξ but not extreme e values, and will not negatively impact model fit because they are not considered outliers; bad influential observations, on the other hand, have both extreme ξ and e values, and will negatively impact model fit.
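The Mahalanobis distance described above can be computed as in the following minimal sketch. The helper name `mahalanobis_sq` and the simulated data are assumptions for illustration. The center and covariance matrix default to the ordinary sample estimates, which are themselves affected by contamination.

```python
import numpy as np

def mahalanobis_sq(Y, mu=None, Sigma=None):
    """Squared Mahalanobis distance of each row of Y from mu under Sigma."""
    Y = np.asarray(Y, dtype=float)
    if mu is None:
        mu = Y.mean(axis=0)
    if Sigma is None:
        Sigma = np.cov(Y, rowvar=False)
    diff = Y - mu
    # Solve Sigma x = diff^T rather than forming the inverse explicitly.
    solved = np.linalg.solve(Sigma, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)

# Simulated data: 200 cases on 3 variables, with one contaminated case.
rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 3))
Y[0] = [8.0, -8.0, 8.0]
d2 = mahalanobis_sq(Y)

# The contaminated case has the largest distance from the center.
most_extreme = int(np.argmax(d2))
```

Because the sample mean and covariance are pulled toward the contaminated case, its distance is smaller than it would be under estimates from the clean cases alone.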

Impact of Outliers and Influential Observations

Existing Robust Estimation Methods in SEM

REAL DATA DEMONSTRATION

SIMULATION STUDY

Simulation Results

DISCUSSION
