ABSTRACT
Partial least squares path modeling (PLS‐PM) has become popular in various disciplines to model structural relationships among latent variables measured by manifest variables. To fully benefit from the predictive capabilities of PLS‐PM, researchers must understand the efficacy of the predictive metrics they use. In this research, we compare the performance of standard PLS‐PM criteria and model selection criteria derived from Information Theory in selecting the best predictive model among a cohort of competing models. We use Monte Carlo simulation to study this question under various sample sizes, effect sizes, item loadings, and model setups. Specifically, we explore whether, and when, in‐sample measures such as the model selection criteria can substitute for out‐of‐sample criteria that require a holdout sample. Such a substitution is advantageous when the overall sample is small and setting aside a holdout would cause a considerable loss of statistical and predictive power. We find that when the researcher does not have the luxury of a holdout sample, and the goal is to select correctly specified models with low prediction error, the in‐sample model selection criteria, in particular the Bayesian Information Criterion (BIC) and the Geweke–Meese Criterion (GM), are useful substitutes for out‐of‐sample criteria. When a holdout sample is available, the best performing out‐of‐sample criteria include the root mean squared error (RMSE) and the mean absolute deviation (MAD). We recommend against using the standard PLS‐PM criteria (R², adjusted R², and Q²) and, in particular, the out‐of‐sample mean absolute percentage error (MAPE) for prediction‐oriented model selection purposes. Finally, we illustrate the model selection criteria's practical utility using a well‐known corporate reputation model.
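To make the distinction between in‐sample and out‐of‐sample criteria concrete, the following is a minimal sketch rather than the study's implementation. It assumes the common regression form of BIC, n·ln(SSE/n) + k·ln(n), applied to a single endogenous construct, and the textbook definitions of RMSE, MAD, and MAPE. All function names and all numbers are hypothetical and chosen only for illustration.

```python
import math

def bic(sse, n, k):
    """In-sample BIC (assumed regression form): n*ln(SSE/n) + k*ln(n).

    sse: sum of squared residuals, n: sample size,
    k: number of estimated parameters (predictors plus intercept).
    """
    return n * math.log(sse / n) + k * math.log(n)

def rmse(actual, predicted):
    """Out-of-sample root mean squared error on a holdout sample."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mad(actual, predicted):
    """Out-of-sample mean absolute deviation on a holdout sample."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Out-of-sample mean absolute percentage error (undefined if an actual value is 0)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical competing models fitted on the same n = 100 cases:
# model B fits slightly better in-sample but uses two more parameters.
bic_a = bic(sse=40.0, n=100, k=3)
bic_b = bic(sse=39.0, n=100, k=5)
preferred = "A" if bic_a < bic_b else "B"  # lower BIC is preferred

# Hypothetical holdout values for one endogenous construct.
holdout_actual = [2.0, 4.0, 5.0, 8.0]
holdout_predicted = [2.5, 3.5, 5.0, 7.0]
holdout_rmse = rmse(holdout_actual, holdout_predicted)
holdout_mad = mad(holdout_actual, holdout_predicted)
holdout_mape = mape(holdout_actual, holdout_predicted)
```

In this sketch the BIC comparison needs only the estimation sample, whereas RMSE, MAD, and MAPE need predictions for cases withheld from estimation. MAPE divides each error by the actual value, so it becomes unstable when actual values are close to zero.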