Abstract

Growth mixture modeling is a popular analytic tool for longitudinal data analysis. It detects latent groups based on the shapes of growth trajectories. Traditional growth mixture modeling assumes that outcome variables are normally distributed within each class. When data violate this normality assumption, however, it is well documented that traditional growth mixture modeling misleads researchers both in determining the number of latent classes and in estimating parameters. To address nonnormal data in growth mixture modeling, robust methods based on various nonnormal distributions have been developed. As a new robust approach, growth mixture modeling based on conditional medians has been proposed. In this article, we present the results of two simulation studies that evaluate the performance of median-based growth mixture modeling in identifying the correct number of latent classes when data follow the normality assumption or contain outliers. We also compare the performance of median-based growth mixture modeling with that of traditional growth mixture modeling and of robust growth mixture modeling based on t distributions. For identifying the number of latent classes, three Bayesian model comparison criteria were considered: the deviance information criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and leave-one-out cross-validation (LOO-CV). Median-based and t-based growth mixture modeling maintained high model selection accuracy across all conditions in this study (ranging from 87% to 100%). In traditional growth mixture modeling, however, model selection accuracy was greatly influenced by the proportion of outliers. When the sample size was 500 and the proportion of outliers was 0.05, the correct model was preferred in about 90% of the replications, but the percentage dropped to about 40% as the proportion of outliers increased to 0.15.

Highlights

  • Growth mixture modeling has been widely used for longitudinal data analyses in social and behavioral research

  • Two simulation studies were conducted to answer the following research questions: 1) how well do Bayesian model comparison criteria used in a growth mixture model analysis correctly identify the number of latent classes when the population is heterogeneous and the normality assumption holds? and 2) how well does median-based growth mixture modeling perform in identifying the correct number of latent classes when the population is heterogeneous and contains outliers? We examined the class enumeration performance of median-based growth mixture modeling and compared it with that of traditional growth mixture modeling and of growth mixture modeling based on t-distributed measurement errors, which is known to be robust to nonnormal data (Zhang et al., 2013; Lu and Zhang, 2014)

  • We examined the performance of DIC, WAIC, and LOO-CV used in the traditional growth mixture model (GMM), median-based GMM, and t-based GMM when data followed the within-class normality assumption
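
To make one of the criteria named in the highlights more concrete, the following is a minimal sketch of how WAIC can be computed from posterior draws. It assumes the common deviance-scale definition, WAIC = -2 (lppd - p_WAIC), with the variance-based penalty; the authors' exact implementation is not given here and may differ. The function names and the toy input are hypothetical. For a growth mixture model, each entry of `log_lik` would presumably be the log-likelihood of one individual with the latent class summed out, which is also an assumption of this sketch.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def sample_variance(values):
    """Unbiased sample variance across posterior draws."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def waic(log_lik):
    """WAIC on the deviance scale (smaller is better).

    log_lik[s][i] is the log-likelihood of observation i evaluated at
    posterior draw s (hypothetical input layout for this sketch).
    """
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance form)
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd += log_mean_exp(column)
        p_waic += sample_variance(column)
    return -2.0 * (lppd - p_waic)


if __name__ == "__main__":
    # Toy example: 4 posterior draws, 3 observations (made-up numbers).
    toy = [
        [-1.2, -0.8, -2.1],
        [-1.0, -0.9, -2.4],
        [-1.3, -0.7, -2.0],
        [-1.1, -0.8, -2.2],
    ]
    print(waic(toy))
```

In a class enumeration setting, this quantity would be computed for each candidate number of classes, and the model with the smallest value would be preferred.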

Summary

INTRODUCTION

Growth mixture modeling has been widely used for longitudinal data analyses in social and behavioral research. Traditional growth mixture modeling is built upon the assumption that latent growth factors and measurement errors are normally distributed. Two simulation studies were conducted to answer the following research questions: 1) how well do Bayesian model comparison criteria used in a growth mixture model analysis correctly identify the number of latent classes when the population is heterogeneous and the normality assumption holds? and 2) how well does median-based growth mixture modeling perform in identifying the correct number of latent classes when the population is heterogeneous and contains outliers? The first simulation study presents the performance of DIC, WAIC, and LOO-CV used in the three types of growth mixture models when data are normally distributed within each class. For the t-based growth mixture modeling approach, we assumed that the measurement errors follow a multivariate t distribution, $\varepsilon_i \sim MT_T(0, \Sigma_g, \nu_g)$, where $\nu_g$ is the degrees of freedom, $0$ is the mean of $\varepsilon_i$, and $\Sigma_g$ is a $T \times T$ scale matrix. The distribution of $y_{it}$ conditional on $b_{ig}(0.5)$ can be written as $y_{it} \mid b_{ig}(0.5) \sim LD(\Lambda_t b_{ig}(0.5), \delta_g)$.
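
The practical difference between the three measurement models can be illustrated by how strongly each one penalizes a large residual. The sketch below is a simplification under stated assumptions: it uses univariate densities rather than the multivariate t distribution described above, it treats LD as a symmetric Laplace density with location and scale parameters, and all function names and numeric values are hypothetical rather than taken from the study.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log-density of a normal distribution (traditional GMM errors)."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(y, mu, sigma, nu):
    """Log-density of a location-scale Student t with nu degrees of freedom.

    Univariate stand-in for the multivariate t errors (an assumption).
    """
    z = (y - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1.0) / 2.0 * math.log(1.0 + z * z / nu)
    )


def laplace_logpdf(y, mu, delta):
    """Log-density of a symmetric Laplace distribution with scale delta.

    Assumed parameterization; maximizing it in mu targets the median.
    """
    return -math.log(2.0 * delta) - abs(y - mu) / delta


if __name__ == "__main__":
    # Compare a typical residual with an outlying one (made-up values).
    for residual in (0.5, 6.0):
        print(
            residual,
            normal_logpdf(residual, 0.0, 1.0),
            student_t_logpdf(residual, 0.0, 1.0, 5.0),
            laplace_logpdf(residual, 0.0, 1.0),
        )
```

The normal log-density falls off quadratically in the residual, whereas the Laplace log-density falls off linearly and the t log-density logarithmically. A single outlier therefore pulls a normal-based fit much harder, which is consistent with the sensitivity of traditional growth mixture modeling to outliers reported in the abstract.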

BAYESIAN ESTIMATION
MODEL SELECTION
SIMULATION STUDIES
Simulation Design
Estimation
Study 2
CONCLUSION AND DISCUSSION