Abstract

Predisposition to a disease is usually caused by the cumulative effects of a multitude of exposures and lifestyle factors in combination with individual susceptibility. Failure to include all relevant variables may result in biased risk estimates and decreased power, whereas inclusion of all variables may lead to computational difficulties, especially when variables are correlated. We describe a Bayesian Mixture Model (BMM) incorporating a variable-selection prior and compare its performance with that of a multiple logistic regression model (LM) in simulated case-control data with up to twenty exposures of varying prevalence and correlation. In addition, as a practical example, we reanalyzed data on male infertility and occupational exposures (Chaps-UK). BMM mean-squared errors (MSE) were smaller than those of the LM and were independent of the number of model parameters. The number of BMM type I errors was minimal (≤1), whereas for the LM it increased with the number of parameters and with the correlation between exposures. The numbers of type II errors were comparable. Reanalysis of the Chaps-UK data demonstrated more convincingly than the LM that occupational exposures to glycol ethers and VOCs are likely risk factors for male infertility. The BMM is thus an appealing alternative to standard logistic regression for the analysis of (correlated) exposures in case-control studies.

