Abstract

As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).
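To make the regression-based admixture estimate concrete, here is a minimal sketch for the simplest case of two ancestral populations. It assumes the pooled allele frequency at each SNP is a linear mixture of the two reference-panel frequencies, p_pool = m * p_ref1 + (1 - m) * p_ref2, and solves for m by ordinary least squares across SNPs. The function name, the two-population restriction, and the clamping to [0, 1] are illustrative assumptions; the regression actually used in the study may differ (for example, by handling more than two reference panels).

```python
def estimate_admixture_two_way(pooled_af, ref1_af, ref2_af):
    """Least-squares estimate of the ancestry proportion m from population 1.

    Assumed model per SNP: pooled = m * ref1 + (1 - m) * ref2,
    rearranged to (pooled - ref2) = m * (ref1 - ref2).
    All three arguments are equal-length sequences of allele
    frequencies for the same allele at the same SNPs.
    """
    if not (len(pooled_af) == len(ref1_af) == len(ref2_af)):
        raise ValueError("allele frequency lists must have equal length")

    numerator = 0.0
    denominator = 0.0
    for pooled, ref1, ref2 in zip(pooled_af, ref1_af, ref2_af):
        delta = ref1 - ref2              # frequency difference between references
        numerator += (pooled - ref2) * delta
        denominator += delta * delta

    if denominator == 0.0:
        raise ValueError("reference panels are identical at every SNP")

    m = numerator / denominator
    # Clamp to the valid range (an illustrative choice, not from the source).
    return min(1.0, max(0.0, m))
```

SNPs with large frequency differences between the reference panels contribute most to the estimate, because each SNP is weighted by the squared difference in the denominator.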

Highlights

  • Genetic ancestry, as studied through DNA sequence variation, has shed light on the history, migration patterns, and relationships among human populations [1,2]

  • As we move forward from hypothesis-generating genome-wide association (GWA) studies, the research focus will start to shift to fine mapping of associated signals and/or pathways identified through such studies and will expand to include understudied diseases as well as studies in additional populations of unknown ancestry

  • For each pool, pooled allele frequencies (AF) were estimated as the proportion of angular distance observed for the pooled sample relative to that observed for the individual samples on the same plate, and averaged over all replicates
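The angular-distance estimate in the last highlight can be sketched as follows. This is one plausible reading, not the study's exact procedure: it assumes each SNP yields a pair of A-allele and B-allele probe intensities, that the angle is atan2(B, A), and that the individual samples on the same plate supply reference intensities for the two homozygous genotype clusters. The function names and the clamping to [0, 1] are hypothetical.

```python
import math


def intensity_angle(a_intensity, b_intensity):
    """Polar angle (radians) of an (A, B) allele intensity pair."""
    return math.atan2(b_intensity, a_intensity)


def pooled_b_allele_frequency(pool_replicates, ref_aa, ref_bb):
    """Estimate the B-allele frequency of a DNA pool at one SNP.

    pool_replicates: list of (A, B) intensity pairs, one per replicate.
    ref_aa, ref_bb: (A, B) intensity pairs representing the AA and BB
        homozygote clusters from individual samples on the same plate
        (assumed inputs; how they are summarized is not specified here).
    """
    theta_aa = intensity_angle(*ref_aa)
    theta_bb = intensity_angle(*ref_bb)
    span = theta_bb - theta_aa
    if span == 0.0:
        raise ValueError("homozygote clusters have the same angle")

    estimates = []
    for a_intensity, b_intensity in pool_replicates:
        # Fraction of the AA-to-BB angular distance covered by the pool.
        fraction = (intensity_angle(a_intensity, b_intensity) - theta_aa) / span
        estimates.append(min(1.0, max(0.0, fraction)))

    # Average over all replicates, as described in the text.
    return sum(estimates) / len(estimates)
```

For example, with homozygote references at (1, 0) and (0, 1), a pool replicate with equal A and B intensities sits halfway between the clusters and gives an estimate of 0.5.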


Summary

Introduction

Genetic ancestry, as studied through DNA sequence variation, has shed light on the history, migration patterns, and relationships among human populations [1,2]. In the context of medical population genetics, genetic ancestry forms the basis of admixture mapping [3]. Genetic ancestry is also useful for proper matching of cases and controls and is an important covariate to consider in association studies of complex human traits [4,5], because spurious associations around variants with large allele frequency differences between populations have long been recognized as potential confounders [6,7,8,9]. For admixed populations, having an estimated proportion of genetic ancestry attributable to each ancestral population (i.e., the admixture proportion) would

