Abstract

Psychologists use scales composed of multiple items to measure underlying constructs. Missing data on such scales often occur at the item level, whereas the model of interest to the researcher is at the composite (scale score) level. Existing analytic approaches cannot easily accommodate item-level missing data when models involve composites. A very common practice in psychology is to average all available items to produce scale scores. This approach, referred to as available-case maximum likelihood (ACML), may produce biased parameter estimates. Another approach researchers use to deal with item-level missing data is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if any item is missing. SL-FIML is inefficient and it may also exhibit bias. Multiple imputation (MI) produces the correct results using a simulation-based approach. We study a new analytic alternative for item-level missingness, called two-stage maximum likelihood (TSML; Savalei & Rhemtulla, Journal of Educational and Behavioral Statistics, 42(4), 405–431, 2017). The original work showed the method outperforming ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, MI, and TSML in the context of univariate regression. We demonstrated performance issues encountered by ACML and SL-FIML when estimating regression coefficients, under both MCAR and MAR conditions. Aside from convergence issues with small sample sizes and high missingness, TSML performed similarly to MI in all conditions, showing negligible bias, high efficiency, and good coverage. This fast analytic approach is therefore recommended whenever it achieves convergence. R code and a Shiny app to perform TSML are provided.

Highlights

  • Psychologists across many subfields often use measures that are composed of multiple items

  • When data are missing at the item level, scale-level full-information maximum likelihood (SL-FIML; Savalei & Rhemtulla, 2017) uses listwise deletion to compute scale scores and then applies FIML at the composite level

  • These results suggest that practical applications of two-stage maximum likelihood (TSML) may require the use of more sophisticated implementations of the EM algorithm, especially for small sample sizes


Summary

Introduction

Psychologists across many subfields often use measures that are composed of multiple items. The composite scale scores computed from these items are frequently used in analyses such as regression. This application is found in a wide range of research topics, from the relationship between personality types and depression (Dhondt et al., 2013) and between mind-wandering and attention deficit disorder (Seli, Smallwood, Cheyne, & Smilek, 2015) to internet and smartphone addiction (Choi et al., 2015). Participants answering an inventory questionnaire may refuse to answer questions that they deem too sensitive, leave items blank when they do not apply, quit the questionnaire early because it is too long, or skip items due to carelessness. Item-level missing data present a difficult problem when the researcher is interested in fitting a model at the composite level, which requires the computation of composite scores, because it is not straightforward to compute such scores in the presence of missing data.
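To make the scoring problem concrete, the following minimal sketch contrasts the two ways of forming a composite that underlie ACML and SL-FIML as described in the abstract: averaging whatever items were answered, versus treating the whole scale score as missing if any item is missing. It illustrates only the scoring step, not the maximum likelihood estimation that follows it, and it is not the authors' R code. The function names and the small response matrix are hypothetical and chosen for illustration.

```python
import math

def available_item_mean(items):
    """Average the answered items; the score is missing only if every item is missing."""
    observed = [x for x in items if not math.isnan(x)]
    return sum(observed) / len(observed) if observed else math.nan

def complete_item_mean(items):
    """Average the items only when all were answered; otherwise the score is missing."""
    if any(math.isnan(x) for x in items):
        return math.nan
    return sum(items) / len(items)

# Hypothetical responses to a 4-item scale (one row per participant, NaN = skipped item).
responses = [
    [4.0, 5.0, 3.0, 4.0],                          # fully observed
    [2.0, math.nan, 3.0, 1.0],                     # one item skipped
    [math.nan, math.nan, math.nan, math.nan],      # whole scale skipped
]

# Scoring step assumed for the available-case approach.
available_scores = [available_item_mean(r) for r in responses]

# Scoring step assumed for the scale-level approach.
complete_scores = [complete_item_mean(r) for r in responses]

print(available_scores)
print(complete_scores)
```

For the second participant, the available-item score is the mean of three items, while the scale-level score is missing altogether. The first rule implicitly treats the skipped item as equal to the mean of the answered ones, and the second discards the three observed responses, which is the information loss behind the inefficiency noted in the abstract.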

