Abstract
Data pooling is a powerful strategy in empirical research. However, combining multiple datasets often results in a large amount of missing data, as variables that are not present in some datasets effectively contain missing values for all participants in those datasets. Furthermore, data pooling typically leads to a mix of continuous and categorical items with nonnormal multivariate distributions. We investigated two popular approaches to handle missing data in this context: (1) applying direct maximum likelihood by treating data as continuous (con-ML), and (2) applying categorical least squares using a polychoric correlation matrix computed from pairwise deletion (cat-LS). These approaches are available for free and relatively straightforward for empirical researchers to implement. Through simulation studies with confirmatory factor analysis and latent mediation analysis, we found cat-LS to be unsuitable for pooled data analysis, whereas con-ML yielded acceptable performance for the estimation of latent path coefficients barring severe nonnormality.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Structural Equation Modeling: A Multidisciplinary Journal
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.