Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data

Yang Wu,Zhili Zheng,Jian Yang,Peter M Visscher

doi:10.1186/s13059-017-1216-0

Abstract

Background: Understanding the mapping precision of genome-wide association studies (GWAS), that is the physical distances between the top associated single-nucleotide polymorphisms (SNPs) and the causal variants, is essential to design fine-mapping experiments for complex traits and diseases.

Results: Using simulations based on whole-genome sequencing (WGS) data from 3642 unrelated individuals of European descent, we show that the association signals at rare causal variants (minor allele frequency ≤ 0.01) are very unlikely to be mapped to common variants in GWAS using either WGS data or imputed data and vice versa. We predict that at least 80% of the common variants identified from published GWAS using imputed data are within 33.5 Kbp of the causal variants, a resolution that is comparable with that using WGS data. Mapping precision at these loci will improve with increasing sample sizes of GWAS in the future. For rare variants, the mapping precision of GWAS using WGS data is extremely high, suggesting WGS is an efficient strategy to detect and fine-map rare variants simultaneously. We further assess the mapping precision by linkage disequilibrium between GWAS hits and causal variants and develop an online tool (gwasMP) to query our results with different thresholds of physical distance and/or linkage disequilibrium (http://cnsgenomics.com/shiny/gwasMP).

Conclusions: Our findings provide a benchmark to inform future design and development of fine-mapping experiments and technologies to pinpoint the causal variants at GWAS loci.

Highlights

Understanding the mapping precision of genome-wide association studies (GWAS), that is the physical distances between the top associated single-nucleotide polymorphisms (SNPs) and the causal variants, is essential to design fine-mapping experiments for complex traits and diseases
The simulations were based on whole-genome sequencing (WGS) data on 3642 unrelated individuals and ~17.6 million genetic variants from the UK10K project [7] after quality controls (QC)
In each simulation replicate, we randomly sampled a sequence variant as the causal variant to generate a phenotype and performed genome-wide association analyses of the simulated phenotype using genotype data from four different genotyping/imputation strategies: (1) WGS data; (2) SNP-array data imputed to HapMap phase 2 [8] (HapMap2); (3) SNP-array data imputed to 1000 Genomes Project [9] (1KGP) phase 1 (1KGP1); (4) SNP-array data imputed to 1KGP phase 3 (1KGP3) (Wu et al., Genome Biology (2017) 18:86)
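
The replicate procedure described in the highlights (sample a causal variant, generate a phenotype, scan for association, locate the top hit) can be illustrated with a toy sketch. This is not the authors' pipeline: the genotypes are synthetic rather than UK10K data, linkage disequilibrium is mimicked with a simple autoregressive model, and the function name `simulate_replicate` and all parameters (`n_ind`, `n_snp`, `h2`, `rho`, `causal_genotyped`) are hypothetical. Setting `causal_genotyped=False` merely hides the causal variant from the scan, a crude stand-in for data in which the causal variant is not observed; it does not perform imputation.

```python
import numpy as np

def simulate_replicate(n_ind=500, n_snp=200, h2=0.05, rho=0.95,
                       causal_genotyped=True, seed=0):
    """Toy replicate: return the physical distance (bp) between the top
    associated variant and the randomly chosen causal variant."""
    rng = np.random.default_rng(seed)

    # Hypothetical, unique variant positions on a 1 Mbp segment.
    pos = np.sort(rng.choice(1_000_000, size=n_snp, replace=False))

    # Synthetic haplotypes: a latent AR(1) process along the segment is
    # thresholded per variant, so neighbouring variants are correlated (LD).
    n_hap = 2 * n_ind
    z = np.empty((n_hap, n_snp))
    z[:, 0] = rng.standard_normal(n_hap)
    for j in range(1, n_snp):
        noise = rng.standard_normal(n_hap)
        z[:, j] = rho * z[:, j - 1] + np.sqrt(1 - rho**2) * noise
    freq = rng.uniform(0.05, 0.5, size=n_snp)  # assumed allele frequencies
    cut = np.array([np.quantile(z[:, j], freq[j]) for j in range(n_snp)])
    hap = (z < cut).astype(float)
    geno = hap[:n_ind] + hap[n_ind:]           # genotypes coded 0/1/2

    # Standardise genotypes, pick one causal variant, simulate the phenotype
    # so that the causal variant explains a fraction h2 of its variance.
    x = (geno - geno.mean(axis=0)) / geno.std(axis=0)
    causal = int(rng.integers(n_snp))
    y = np.sqrt(h2) * x[:, causal] + np.sqrt(1 - h2) * rng.standard_normal(n_ind)
    y = (y - y.mean()) / y.std()

    # Single-variant scan: for simple linear regression, ranking variants by
    # squared correlation with the phenotype matches ranking by test statistic.
    r2 = (x.T @ y / n_ind) ** 2
    if not causal_genotyped:
        r2[causal] = -1.0                      # hide the causal variant
    top = int(np.argmax(r2))
    return int(abs(pos[top] - pos[causal]))
```

Calling `simulate_replicate(seed=s)` over many seeds yields a sample of top-hit-to-causal distances, which is the quantity the study summarises for each genotyping strategy.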

Summary

Introduction

Understanding the mapping precision of genome-wide association studies (GWAS), that is, the physical distances between the top associated single-nucleotide polymorphisms (SNPs) and the causal variants, is essential to design fine-mapping experiments for complex traits and diseases. A few studies have been able to pinpoint the causal variant and/or the functional gene(s) at a GWAS locus [2,3,4,5]. Such examples are rare to date, and high-throughput experiments and technologies are in high demand to fine-map the causal variants and/or genes at the GWAS loci [6]. Understanding the distribution of the distances between the top associated variants in GWAS and the underlying causal variants is essential to design and develop such fine-mapping experiments and technologies. We seek to quantify the empirical distribution of physical distances between GWAS hits and causal variants for different genotyping strategies using simulations.
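
As a rough illustration of how an empirical distribution of distances can be summarised, the sketch below computes two quantities from a list of per-replicate distances: the fraction of replicates whose top hit lies within a chosen distance of the causal variant, and the smallest distance that covers a chosen proportion of replicates. The function names and the example distances are hypothetical and are not taken from the paper's results.

```python
import numpy as np

def fraction_within(distances_bp, threshold_bp):
    """Fraction of replicates whose top hit is within threshold_bp of the causal variant."""
    d = np.asarray(distances_bp, dtype=float)
    return float(np.mean(d <= threshold_bp))

def distance_covering(distances_bp, proportion=0.8):
    """Smallest observed distance d such that at least `proportion`
    of replicates have a distance <= d."""
    d = np.sort(np.asarray(distances_bp, dtype=float))
    k = int(np.ceil(proportion * d.size)) - 1  # index of the covering order statistic
    return float(d[max(k, 0)])

# Made-up distances (bp) for ten replicates, for illustration only.
toy_distances = [0, 0, 1000, 5000, 20000, 40000, 100, 250, 0, 75000]
share = fraction_within(toy_distances, 33_500)
cover = distance_covering(toy_distances, 0.8)
```

A statement of the form "at least 80% of hits are within X Kbp of the causal variant" corresponds to `distance_covering(distances, 0.8)` evaluated on the simulated distances for a given genotyping strategy.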

Journal: Genome Biology	Publication Date: May 16, 2017
Citations: 90	License type: open-access
