Abstract

While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data from 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.

Highlights

  • Genome-wide association studies (GWAS) are the currently prevailing approach for identifying genetic variants with a modest effect on the risk of common disease, and have identified hundreds of common risk variants for a wide range of diseases and phenotypes [1,2]

  • This paper presents improved methodologies for the analysis of genome-wide association studies in admixed populations, which are populations that came about by the mixing of two or more distant continental populations over a few hundred years (e.g., African Americans or Latinos)

  • Correlation between genetic variants exists both at a fine scale in the ancestral populations and at a coarse scale due to chromosomal segments of distinct ancestry

Summary

Introduction

Genome-wide association studies (GWAS) are the currently prevailing approach for identifying genetic variants with a modest effect on the risk of common disease, and have identified hundreds of common risk variants for a wide range of diseases and phenotypes [1,2]. GWAS disease mapping in homogeneous populations relies on linkage disequilibrium (LD) between nearby markers to identify SNP association [5]. Admixed populations exhibit another form of LD at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry [6]. Although GWAS and admixture mapping have historically been viewed as distinct approaches, GWAS in admixed populations can in theory capture both SNP and admixture association signals, which have been shown to contain independent information [15]. We show below that combining these signals leads to increased statistical power because case-only admixture association statistics contain information independent from case-control SNP association statistics.
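To illustrate why independence of the two signals matters, the following is a minimal sketch of one generic way to combine them: summing two independent 1-degree-of-freedom chi-square statistics, which under the null hypothesis yields a chi-square statistic with 2 degrees of freedom. This is an assumption made for illustration and is not necessarily the scoring statistic defined in this paper; the function names and the example statistic values are hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def combined_p_value(snp_chi2: float, admixture_chi2: float) -> float:
    """Sum two independent 1-df statistics and test the sum against chi-square(2).

    Assumes the two statistics are independent under the null hypothesis,
    which is the property the text attributes to case-control SNP
    association and case-only admixture association.
    """
    if snp_chi2 < 0 or admixture_chi2 < 0:
        raise ValueError("chi-square statistics must be non-negative")
    return chi2_sf_2df(snp_chi2 + admixture_chi2)


# Hypothetical statistics at a single locus (illustrative values only).
snp_chi2 = 12.0        # case-control SNP association, 1 df
admixture_chi2 = 9.0   # case-only admixture association, 1 df

p_snp = chi2_sf_1df(snp_chi2)
p_admixture = chi2_sf_1df(admixture_chi2)
p_combined = combined_p_value(snp_chi2, admixture_chi2)

print(f"SNP only:       p = {p_snp:.3g}")
print(f"Admixture only: p = {p_admixture:.3g}")
print(f"Combined (2df): p = {p_combined:.3g}")
```

When both signals point to the same locus, the combined test can reach a smaller p-value than either test alone. When only one signal is present, the extra degree of freedom costs some power, which is one reason a study might prefer a more tailored combination than a plain sum.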

