Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits

Marion Patxot, Daniel Trejo Banos, Etienne J Orliac, Gerhard Moser, Reedik Mägi, Matthew R Robinson, Alexander Holloway, Julia Sidorenko, Zoltan Kutalik, Peter M Visscher, Sven E Ojavee, Athanasios Kousathanas, Lars Rönnegård

doi:10.1038/s41467-021-27258-9

Abstract

We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10 kb upstream of genes, while 12–25% is attributed to coding regions, 32–44% to introns, and 22–28% to distal 10–500 kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.
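A statement such as "≥95% probability of contributing ≥0.001% to the genetic variance" can be read as the fraction of posterior draws in which a region's share of the genetic variance meets the threshold. The sketch below illustrates that reading under stated assumptions; it is not the GMRM implementation, and the function name, argument names, and toy numbers are hypothetical.

```python
from typing import Sequence


def prob_share_at_least(
    region_var: Sequence[float],
    total_var: Sequence[float],
    threshold: float = 1e-5,  # 0.001% expressed as a proportion
) -> float:
    """Fraction of posterior draws in which the region's share of the
    total genetic variance is at least `threshold`.

    region_var[i] and total_var[i] are assumed to come from the same
    posterior draw i (hypothetical inputs, e.g. saved MCMC output).
    """
    if len(region_var) != len(total_var) or len(region_var) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = 0
    for r, t in zip(region_var, total_var):
        # Draws with zero total genetic variance cannot meet the threshold.
        if t > 0 and r / t >= threshold:
            hits += 1
    return hits / len(region_var)


# Toy draws (made up for illustration only).
toy_region = [4e-5, 0.0, 2e-5, 1e-6]
toy_total = [0.5, 0.5, 0.4, 0.5]
toy_probability = prob_share_at_least(toy_region, toy_total)
print(toy_probability)
```

Under this reading, a region would be reported when the returned fraction is at least 0.95.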

Highlights

We develop a Bayesian model (BayesRR-RC) that provides robust single nucleotide polymorphism (SNP)-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank.
Genetic markers are grouped into minor allele frequency (MAF)-LD-annotation-specific sets, with independent hyper-parameters for the phenotypic variance attributable to each group, so that the mixture proportions, the variance explained by the SNP markers, and the mixture constants are all unique and independent across SNP marker groups.
We find that 32–44% of the h2SNP is attributable to intronic regions, 12–25% is attributable to exonic regions, and 22–28% is attributable to markers 10–500 kb upstream of genes, with proximal promoters, enhancers and transcription factor binding sites cumulatively contributing

Summary

Introduction

As large-scale biobank data become increasingly available, methods that estimate all marker effects jointly in a single step, treating the effect sizes as random under flexible prior formulations, may become beneficial because they: (i) can account for differences in the variance contributed across MAF, LD, or annotation groups, providing unbiased MAF-LD-annotation-specific genetic effect size estimates and h2SNP estimates for different annotations, which allows the genetic architectures of complex traits to be contrasted; (ii) give the probability that each marker, genomic region, annotation, gene-coding region, or SNP is associated with a phenotype, alongside the proportion of phenotypic variation contributed by each, yielding test statistics that describe the genetic architecture of complex traits and the uncertainty over the estimates; and (iii) provide improved genomic prediction, whilst providing a posterior predictive distribution for each individual. We validate our approach in a large-scale simulation study and provide an empirical example using four traits measured in both the UK Biobank and Estonian Biobank data.
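The idea of group-specific hyper-parameters can be illustrated with a small simulation in which each marker group has its own variance, mixture constants, and mixture proportions. This is a minimal sketch assuming a spike-and-slab mixture of zero-mean normals; the class name, function name, and all numeric values are hypothetical and are not taken from the paper or from GMRM.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class GroupPrior:
    """Hypothetical hyper-parameters for one MAF-LD-annotation group."""
    sigma2: float                 # variance attributable to the group
    constants: Sequence[float]    # mixture constants; 0.0 marks the spike at zero
    proportions: Sequence[float]  # mixture proportions, must sum to 1


def draw_effects(prior: GroupPrior, n_markers: int, rng: random.Random) -> List[float]:
    """Draw marker effects from the group's mixture prior.

    Each marker picks a mixture component with the group's proportions;
    a constant of 0.0 gives an effect of exactly zero, otherwise the
    effect is normal with variance constant * sigma2.
    """
    if len(prior.constants) != len(prior.proportions):
        raise ValueError("constants and proportions must have equal length")
    if not math.isclose(sum(prior.proportions), 1.0):
        raise ValueError("proportions must sum to 1")
    effects = []
    for _ in range(n_markers):
        c = rng.choices(prior.constants, weights=prior.proportions, k=1)[0]
        if c == 0.0:
            effects.append(0.0)
        else:
            effects.append(rng.gauss(0.0, math.sqrt(c * prior.sigma2)))
    return effects


# Two made-up groups with different sparsity and variance.
rng = random.Random(1)
exonic = GroupPrior(sigma2=0.10, constants=(0.0, 1e-4, 1e-3), proportions=(0.90, 0.08, 0.02))
distal = GroupPrior(sigma2=0.05, constants=(0.0, 1e-5, 1e-4), proportions=(0.99, 0.008, 0.002))
exonic_effects = draw_effects(exonic, 1000, rng)
distal_effects = draw_effects(distal, 1000, rng)
print(sum(e != 0.0 for e in exonic_effects), sum(e != 0.0 for e in distal_effects))
```

Because each group carries its own `sigma2`, `constants`, and `proportions`, the sparsity and effect-size scale can differ freely between groups, which is the property the text describes.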



Journal: Nature Communications
Publication Date: Nov 30, 2021
Citations: 19
License type: open-access
