Abstract

Accurately resolving population structure in a sample is important for both linkage and association studies. In this study we investigated the power of single-nucleotide polymorphisms (SNPs) in detecting population structure in a sample of 286 unrelated individuals. We varied the number of SNPs to determine how many are required to approach the degree of resolution obtained with the Collaborative Study on the Genetics of Alcoholism (COGA) short tandem repeat polymorphisms (STRPs). In addition, we selected SNPs with varying minor allele frequencies (MAFs) to determine whether low or high frequency SNPs are more efficient in resolving population structure. We conclude that a set of at least 100 evenly spaced SNPs with MAFs of 40–50% is required to resolve population structure in this dataset. If SNPs with lower MAFs are used, then more than 250 SNPs may be required to obtain reliable results.

Highlights

  • Resolving population structure in a sample is important for both linkage and association studies

  • The self-reported race of these 286 individuals was as follows: 245 European Americans, 26 African Americans, 12 European American/Hispanics, and 3 African American/Hispanics. Each of these individuals was genotyped for the 328 short tandem repeat polymorphisms (STRPs) from the Collaborative Study on the Genetics of Alcoholism (COGA), 4,720 single-nucleotide polymorphisms (SNPs) from the Illumina linkage panel, and 11,120 SNPs from the Affymetrix mapping array, all of which were prepared for Genetic Analysis Workshop 14 [5]

  • The results of the STRP and 1,000 SNP runs in STRUCTURE were in concordance with self-reported race for all the European Americans and African Americans (Table 2)

Introduction

Resolving population structure in a sample is important for both linkage and association studies. Differences in population structure between cases and controls can result in high rates of both type I and type II errors [e.g., 1,2,3]. One hypothesis is that SNPs with high MAFs predate the origins of modern human races and therefore carry little useful information about population structure. It follows that SNPs with low MAFs, being much more recent polymorphisms, may be more informative in resolving population structure. However, the low heterozygosity of these SNPs may limit their usefulness (since the allele frequency differences …
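The link between MAF and heterozygosity mentioned above can be illustrated with a minimal sketch. It assumes a biallelic SNP with genotypes coded as the number of copies of one allele (0, 1, or 2) and Hardy-Weinberg proportions, under which expected heterozygosity is 2p(1 − p) for minor allele frequency p. The function names are hypothetical and chosen for illustration only; this is not the procedure used in the study.

```python
def minor_allele_frequency(genotypes):
    """Estimate the MAF of one biallelic SNP.

    genotypes: per-individual allele counts (0, 1, or 2); None marks a
    missing call. This coding is an assumption made for the sketch.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each individual carries two alleles at an autosomal SNP.
    allele_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(allele_freq, 1 - allele_freq)


def expected_heterozygosity(maf):
    """Expected heterozygosity 2p(1 - p), assuming Hardy-Weinberg proportions."""
    return 2 * maf * (1 - maf)


if __name__ == "__main__":
    # Compare a rare SNP with one in the 40-50% MAF range.
    for maf in (0.05, 0.45):
        print(f"MAF {maf:.2f}: expected heterozygosity "
              f"{expected_heterozygosity(maf):.3f}")
```

Under these assumptions, a SNP with a MAF of 0.45 has roughly five times the expected heterozygosity of a SNP with a MAF of 0.05, which is one way to see why rare SNPs contribute fewer informative genotypes per marker.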
