Abstract

Introduction: Genome-wide association studies (GWAS) of blood lipids have yielded several translational insights for coronary artery disease. Hypothesis: We tested the hypothesis that novel Bayesian methods leveraging genetic correlation can improve insights from lipid genetics analyses. Methods: We used published lipid GWAS results from 297,626 participants of the Million Veteran Program (MVP) and generated summary statistics for 276,096 participants of the UK Biobank (UKBB). We applied a multivariate adaptive shrinkage (mashR) model to each dataset. MashR models the effect of a SNP across lipid traits (LDL-C, HDL-C, total cholesterol, and triglycerides) as a mixture of multivariate normal distributions and uses an empirical Bayes approach to nudge the posterior effect estimates toward the overall patterns observed in the data. Results: In addition to reproducing all 318 loci significantly associated with lipids in MVP, we identified 4,689 novel loci associated with lipids using mashR. We observed a high degree of agreement between MVP and UKBB, with over 95% of UKBB loci replicated in MVP and an additional 1,246 identified. In genetic prioritization analyses, mashR improves the detection of known Mendelian genes and the consistency of ranks among subgroups compared with univariate results. Our approach also improves the estimates of enrichment parameters for known genomic features, similar to established GWAS variants. Finally, when these joint estimates are combined with tools that incorporate the genetic correlation structure, mashR improves the proportion of variation explained by lipid polygenic risk scores by up to 58% (SE 0.015) compared with univariate substrates across lipid traits. This improvement holds across both European and non-European individuals. We show that much of this gain in power arises from improved precision of the posterior estimates, where the ratio of original standard error over posterior marginal variance reflects an increased ‘effective sample size’. Conclusions: Our framework shows that Bayesian multivariate genetic analysis markedly improves power for genomic discovery, better identifies causal genes, and improves polygenic risk prediction for complex traits.
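
The shrinkage step described in the Methods can be illustrated with a minimal sketch. It is not the mashR package or the authors' pipeline: the function name `mash_posterior`, the restriction to two traits, and the fixed prior covariances and mixture weights are all assumptions made for illustration (mashR itself estimates the mixture weights from the data). For one SNP with observed effects `bhat` and error covariance `V`, each mixture component with prior covariance `U` gives a normal posterior, and the components are combined using weights proportional to how well each explains `bhat`.

```python
import math

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(a, v):
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]

def mat_det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def mat_inv(a):
    d = mat_det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def mvn_pdf_zero_mean(x, cov):
    """Density of a 2-dimensional N(0, cov) evaluated at x."""
    y = mat_vec(mat_inv(cov), x)
    quad = x[0] * y[0] + x[1] * y[1]
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(mat_det(cov)))

def mash_posterior(bhat, V, components):
    """Posterior mean and marginal variance of a two-trait effect.

    Model (illustrative): bhat ~ N(b, V), b ~ sum_k pi_k N(0, U_k).
    components is a list of (pi_k, U_k) pairs; both are hypothetical inputs.
    """
    weights, means, variances = [], [], []
    for pi_k, U in components:
        S = mat_add(U, V)                      # marginal covariance of bhat
        weights.append(pi_k * mvn_pdf_zero_mean(bhat, S))
        gain = mat_mul(U, mat_inv(S))          # U (U + V)^-1
        means.append(mat_vec(gain, bhat))      # component posterior mean
        GU = mat_mul(gain, U)
        variances.append([U[i][i] - GU[i][i] for i in range(2)])
    total = sum(weights)
    weights = [w / total for w in weights]     # posterior component weights
    post_mean = [sum(w * m[i] for w, m in zip(weights, means))
                 for i in range(2)]
    # Mixture variance: E[var] + E[mean^2] - (E[mean])^2, per trait.
    post_var = [sum(w * (v[i] + m[i] ** 2)
                    for w, v, m in zip(weights, variances, means))
                - post_mean[i] ** 2
                for i in range(2)]
    return post_mean, post_var, weights

# Illustrative inputs only: two correlated lipid traits, standard error 0.03.
V = [[0.0009, 0.0], [0.0, 0.0009]]
components = [
    (0.8, [[0.0, 0.0], [0.0, 0.0]]),          # no effect on either trait
    (0.1, [[0.01, 0.009], [0.009, 0.01]]),    # effect shared across traits
    (0.1, [[0.01, 0.0], [0.0, 0.0]]),         # effect on trait 1 only
]
post_mean, post_var, weights = mash_posterior([0.10, 0.09], V, components)
print(post_mean, post_var, weights)
```

In this sketch, comparing each diagonal entry of `V` with the corresponding entry of `post_var` gives a rough sense of the precision gain the abstract relates to an increased 'effective sample size'; the exact definition used in the study is not given here, so that comparison is only an assumption.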
