Abstract
The contribution of gene-by-environment (GxE) interactions for many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method is called LEMMA (Linear Environment Mixed Model Analysis) and estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. The ES can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. When applied to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that , , , and , respectively, of phenotypic variance is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores ().
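To make the idea of an environmental score concrete, the following is a minimal simulation sketch, not the LEMMA implementation. It assumes a phenotype of the form y = Gβ + η ∘ (Gγ) + ε with η = Ew, where the function names (`simulate_gxe`, `interaction_test`), the genotype and environment distributions, and all default parameter values are hypothetical choices made for illustration. It shows how a GxE term of this kind makes the phenotypic variance depend on η, and how a generic heteroskedasticity-robust (HC0 sandwich) standard error can be computed for a single-variant interaction term; the exact test statistic used by LEMMA may differ.

```python
import numpy as np


def simulate_gxe(n=5000, m=50, n_env=5, h2_g=0.2, h2_gxe=0.3, seed=0):
    """Simulate y = G@beta + eta * (G@gamma) + noise, with eta = E@w.

    All distributions and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Genotypes: binomial(2, maf) draws, then column-standardised.
    maf = rng.uniform(0.05, 0.5, size=m)
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    G = (G - G.mean(axis=0)) / G.std(axis=0)

    # Environmental variables and the environmental score (a linear combination).
    E = rng.standard_normal((n, n_env))
    w = rng.standard_normal(n_env)
    eta = E @ w
    eta = (eta - eta.mean()) / eta.std()

    def scaled(x, var):
        # Centre x and rescale it to the requested sample variance.
        return (x - x.mean()) / x.std() * np.sqrt(var)

    main = scaled(G @ rng.standard_normal(m), h2_g)
    gxe = scaled(eta * (G @ rng.standard_normal(m)), h2_gxe)
    noise = scaled(rng.standard_normal(n), 1.0 - h2_g - h2_gxe)
    return G, E, eta, main + gxe + noise


def interaction_test(y, g, eta):
    """Regress y on [1, g, eta, g*eta]; return the interaction estimate
    with its naive (homoskedastic) and HC0 robust standard errors."""
    X = np.column_stack([np.ones_like(g), g, eta, g * eta])
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef

    # Naive covariance assumes constant residual variance.
    naive_cov = XtX_inv * resid.var(ddof=X.shape[1])
    # Sandwich covariance weights each row by its own squared residual.
    meat = X.T @ (X * resid[:, None] ** 2)
    robust_cov = XtX_inv @ meat @ XtX_inv
    return coef[3], np.sqrt(naive_cov[3, 3]), np.sqrt(robust_cov[3, 3])


G, E, eta, y = simulate_gxe()
low, high = np.abs(eta) < 1, np.abs(eta) >= 1
print("var(y) for |eta| < 1:", y[low].var())
print("var(y) for |eta| >= 1:", y[high].var())
print("interaction estimate, naive SE, robust SE:", interaction_test(y, G[:, 0], eta))
```

In this sketch the phenotypic variance is larger among individuals with extreme values of the score, which is the heteroskedasticity the abstract refers to. The naive standard error ignores this dependence, whereas the sandwich form does not rely on constant residual variance.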
Highlights
Despite long-standing interest in gene-by-environment (GxE) interactions,[1] this facet of genetic architecture remains poorly characterized in humans.
Simulation studies: Genetic data was sub-sampled from the UK Biobank, by default with N = 25,000 unrelated individuals of mixed ancestry and M = 100,000 genotyped SNPs.
When there is a single true environmental score (ES) involved in GxE interactions, we found that Linear Environment Mixed Model Analysis (LEMMA) provided a substantial increase in power (Figures 1 and S4).
Summary
Despite long-standing interest in gene-by-environment (GxE) interactions,[1] this facet of genetic architecture remains poorly characterized in humans. Detecting GxE interactions is inherently more difficult than detecting additive genetic effects in genome-wide association studies (GWASs). One difficulty is sample size: a commonly cited rule of thumb suggests that detecting an interaction effect requires a sample at least four times larger than that required to detect a main effect of comparable size.[2] Another difficulty is that an individual's environment, which evolves through time, is inherently high dimensional and very hard to measure comprehensively. Many environmental variables could plausibly interact with the genome, there are many ways to combine them, and typically these factors are not all present in the same dataset. The recently released UK Biobank dataset, a large population cohort study with deep genotyping and sequencing and extensive phenotyping,[3] offers a unique opportunity for the exploration of GxE effects.[4,5,6,7,8,9,10]