Abstract

In many epidemiological and environmental health studies, accurately assessing the effects of multiple exposures on a health outcome is often of interest. However, the problem is challenging in the presence of multicollinearity, which can lead to biased estimates of regression coefficients and inflated variance estimators. Selecting one exposure variable as a surrogate for multiple highly correlated exposure variables is often suggested in the literature as a way to handle multicollinearity. However, this may lead to loss of information, since highly correlated exposure variables tend to have not only common but also additional, exposure-specific effects on the outcome variable. In this study, a two-stage latent factor regression method is proposed. The key idea is to regress the dependent variable not only on the common latent factor(s) of the explanatory variables, but also on the residual terms from the factor analysis. The proposed method is compared with traditional latent factor regression and principal component regression in terms of how well each handles multicollinearity. Two case studies are presented. Simulation studies are performed to assess the performance of the methods in terms of epidemiological interpretation and stability of parameter estimates.
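The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `two_stage_latent_factor_regression`, the use of principal-component extraction from the correlation matrix as a stand-in for a factor-analysis estimator, and the synthetic data are all assumptions made for illustration. With this particular extraction the factor scores and residual terms together span the same space as the original exposures, so the stage-two design matrix is rank-deficient; the sketch relies on the minimum-norm least-squares solution, and the study itself may resolve this differently.

```python
import numpy as np


def two_stage_latent_factor_regression(X, y, n_factors=1):
    """Sketch: regress y on common factor scores plus factor-analysis residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Stage 1: standardize the exposures and extract common factors.
    # ASSUMPTION: principal-component extraction from the correlation matrix
    # stands in for whatever factor-analysis estimator the study uses.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Xz, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    top = np.argsort(eigvals)[::-1][:n_factors]
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])      # shape (p, k)
    scores = Xz @ eigvecs[:, top] / np.sqrt(eigvals[top])   # shape (n, k)
    residuals = Xz - scores @ loadings.T                    # shape (n, p)

    # Stage 2: regress the outcome on an intercept, the factor scores,
    # and the residual terms. The design is rank-deficient here, so lstsq
    # returns the minimum-norm solution (rcond drops near-zero directions).
    design = np.column_stack([np.ones(n), scores, residuals])
    coef, *_ = np.linalg.lstsq(design, y, rcond=1e-10)

    return {
        "intercept": coef[0],
        "factor_coef": coef[1:1 + n_factors],
        "residual_coef": coef[1 + n_factors:],
        "scores": scores,
        "residuals": residuals,
        "fitted": design @ coef,
    }


# Synthetic example (made-up data): three exposures driven by one shared factor.
rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 1))
X_demo = shared + 0.3 * rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=200)

fit = two_stage_latent_factor_regression(X_demo, y_demo, n_factors=1)
print("common-factor effect:", fit["factor_coef"])
print("exposure-specific effects:", fit["residual_coef"])
```

In this sketch the factor coefficient summarizes the effect shared by the correlated exposures, while the residual coefficients capture what each exposure contributes beyond that common component.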
