Abstract

Background: Variable selection is an important issue in many fields such as public health and psychology. Researchers often gather data on many variables of interest and are then faced with two challenging goals: building an accurate model with few predictors, and making probabilistic statements (inference) about this model. Unfortunately, it is currently difficult to attain these goals with the two most popular variable selection methods: stepwise selection and LASSO. The aim of the present study was to demonstrate the use of predictive projection feature selection – a novel Bayesian variable selection method that delivers both predictive power and inference. We apply predictive projection to a sample of New Zealand young adults, use it to build a compact model for predicting well-being, and compare it to other variable selection methods.

Methods: The sample consisted of 791 young adults (ages 18 to 25, 71.7% female) living in Dunedin, New Zealand who had taken part in the Daily Life Study in 2013–2014. Participants completed a 13-day online daily diary assessment of their well-being and a range of lifestyle variables (e.g., sleep, physical activity, diet variables). The participants’ diary data were averaged across days and analyzed cross-sectionally to identify predictors of average flourishing. Predictive projection was used to select as few predictors as necessary to approximate the predictive accuracy of a reference model with all 28 predictors. Predictive projection was also compared to other variable selection methods, including stepwise selection and LASSO.

Results: Three predictors were sufficient to approximate the predictions of the reference model: higher sleep quality, less trouble concentrating, and more servings of fruit. The performance of the projected submodel generalized well. Compared to other variable selection methods, predictive projection produced models with either matching or slightly worse performance; however, this performance was achieved with far fewer predictors.

Conclusion: Predictive projection was used to efficiently arrive at a compact model with good predictive accuracy. The predictors selected into the submodel – felt refreshed after waking up, had less trouble concentrating, and ate more servings of fruit – were all theoretically meaningful. Our findings showcase the utility of predictive projection in a practical variable selection problem.

Highlights

  • Variable selection is an important issue in many fields such as public health and psychology

  • There were few outliers with very low average daily flourishing in the data; when we evaluated the model via PSIS-LOO cross-validation, we found no evidence that these observations had a disproportionate influence on the model fit, as indicated by satisfactory Pareto-k values

  • We found that the submodel based on the 1SE rule, which included only 3 predictors, made predictions that were similar enough to those of the reference model with all 28 predictors
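
The 1SE rule mentioned in the last highlight can be illustrated with a minimal sketch. It assumes one common reading of the rule: choose the smallest submodel whose estimated predictive performance (here, the difference in expected log predictive density, ELPD, relative to the reference model) lies within one standard error of the reference model. The function name `smallest_size_within_one_se` and all numbers are hypothetical and made up for illustration; they are not the study's estimates, and this is not necessarily the exact criterion the authors used.

```python
from typing import Sequence


def smallest_size_within_one_se(
    elpd_diff: Sequence[float], se_diff: Sequence[float]
) -> int:
    """Return the smallest submodel size that passes a 1SE-style check.

    elpd_diff[k] is ELPD(submodel with k predictors) - ELPD(reference model),
    so it is usually negative. se_diff[k] is the standard error of that
    difference. Size k passes when elpd_diff[k] + se_diff[k] >= 0, i.e. the
    submodel is within one standard error of the reference model.
    """
    if len(elpd_diff) != len(se_diff):
        raise ValueError("elpd_diff and se_diff must have the same length")
    if not elpd_diff:
        raise ValueError("at least one submodel size is required")

    for size, (diff, se) in enumerate(zip(elpd_diff, se_diff)):
        if diff + se >= 0:
            return size
    # No submodel passes: fall back to the largest size on the search path.
    return len(elpd_diff) - 1


# Hypothetical search path for sizes 0..5 (illustrative values only).
example_elpd_diff = [-120.0, -60.0, -25.0, -8.0, -5.0, -3.0]
example_se_diff = [15.0, 12.0, 10.0, 9.0, 6.0, 4.0]

chosen_size = smallest_size_within_one_se(example_elpd_diff, example_se_diff)
print(chosen_size)  # 3
```

With these illustrative inputs, sizes 0 to 2 fall short of the reference model by more than one standard error, so the sketch returns size 3. Larger submodels also pass the check, but the rule prefers the most compact one.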


Summary

Introduction

Variable selection is an important issue in many fields such as public health and psychology. Researchers often gather data on many variables of interest and are faced with two challenging goals: building an accurate model with few predictors, and making probabilistic statements (inference) about this model. We apply predictive projection to a sample of New Zealand young adults, use it to build a compact model for predicting well-being, and compare it to other variable selection methods. In health and well-being research, researchers may collect data on many demographic, lifestyle, and psychological variables, and aim to build a compact model with fewer variables that can accurately predict the participants’ self-reported well-being. Researchers should also be able to make probabilistic statements about the selected model: how uncertain the selection is, how variable the model’s performance is, and, perhaps most importantly, how strong and reliable the relationships between the selected predictors and the outcome are [69]. Variable selection should produce models that simultaneously provide both of these important functions: predictive power and inference.
