Large-scale gene-environment interaction (GxE) discovery efforts often involve analytical compromises for the sake of data harmonization and statistical power. Refinement of exposures, covariates, outcomes, and population subsets may help to establish often-elusive replication and to evaluate potential clinical utility. Here, we used additional datasets, an expanded set of statistical models, and interrogation of lipoprotein metabolism via nuclear magnetic resonance (NMR)-based lipoprotein subfractions to refine a previously discovered GxE modifying the relationship between physical activity (PA) and HDL-cholesterol (HDL-C). We explored this GxE in the Women's Genome Health Study (WGHS; N = 23,294; the strongest cohort-specific signal in the original meta-analysis), the UK Biobank (UKB; N = 281,380), and the Multi-Ethnic Study of Atherosclerosis (MESA; N = 4,587), using self-reported PA (MET-min/wk) and genotypes at rs295849 (nearest gene: LHX1). As originally reported, minor allele carriers of rs295849 in WGHS had a stronger positive association between PA and HDL-C (p_int = 0.002). When testing available NMR metabolites to refine the HDL-C outcome, we found a stronger interaction effect on medium-sized HDL particle concentrations (M-HDL-P; p_int = 1.0 × 10⁻⁴) than on HDL-C. Meta-regression revealed a systematically larger interaction effect in cohorts from the original meta-analysis with a greater fraction of women (p = 0.018). In the UKB, GxE effects were stronger in women and when M-HDL-P was used as the outcome. In MESA, the primary interaction for HDL-C showed nominal significance (p_int = 0.013), but without clear sex differences and with a greater magnitude for large HDL-P. Our work provides additional insights into a known gene-PA interaction while illustrating the importance of phenotype and model refinement toward understanding and replicating GxEs.