Abstract

Standard logistic regression analysis of case–control data has low power to detect gene–environment interactions, but until recently it was the only method that could be used on complex polygenic data for which parametric distributional models are not feasible. Under the assumption of gene–environment independence in the underlying population, Stalder et al. (Biometrika, 104:801–812, 2017) developed a retrospective method that treats both genetic and environmental variables nonparametrically. However, their method overlooks the mathematical symmetry between the genetic and environmental variables. We propose an improvement to the method of Stalder et al. that increases the efficiency of the estimates with no additional assumptions and at modest computational cost. The improvement is achieved by treating the genetic and environmental variables symmetrically, which yields two sets of parameter estimates that are then combined into a single, more efficient estimate. We employ a semiparametric framework to develop the asymptotic theory of the estimator, show its asymptotic efficiency gain, and evaluate its performance via simulation studies. The method is illustrated using data from a case–control study of breast cancer.
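
The abstract does not state how the two sets of estimates are combined. As a rough illustration only, the sketch below shows one standard way to merge two unbiased estimates of the same scalar parameter: the minimum-variance linear combination, which is never less efficient than either input. The function name `combine_estimates` and its arguments are hypothetical and are not taken from the paper, and the variances and covariance are assumed known (in practice they would be estimated).

```python
def combine_estimates(est1, est2, var1, var2, cov12):
    """Minimum-variance unbiased combination w*est1 + (1 - w)*est2.

    Hypothetical helper for illustration; not the estimator from the paper.
    var1, var2 are the variances of the two estimates, cov12 their covariance.
    """
    denom = var1 + var2 - 2.0 * cov12  # equals Var(est1 - est2)
    if denom <= 0:
        raise ValueError("Var(est1 - est2) must be positive to combine")
    # The weight on est1 that minimises the variance of the combination.
    w = (var2 - cov12) / denom
    combined = w * est1 + (1.0 - w) * est2
    # Variance of the optimally weighted combination.
    combined_var = (var1 * var2 - cov12 ** 2) / denom
    return combined, combined_var, w


# Two uncorrelated estimates with equal variance are simply averaged,
# and the variance of the result is halved.
estimate, variance, weight = combine_estimates(1.0, 2.0, 1.0, 1.0, 0.0)
print(estimate, variance, weight)
```

With unequal variances the weight shifts toward the more precise estimate, and the combined variance is never larger than the smaller of the two input variances.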
