Abstract

Population stratification is always a concern in association analysis. How serious the problem is in less extreme situations remains under debate (Thomas and Witte [1], Wacholder et al. [2]). Wacholder et al. [3] and Ardlie et al. [4] showed that hidden population structure is not a serious threat to case-control designs. We propose a method for assessing the seriousness of population stratification before an association study is designed. If population stratification is not a serious problem, one may consider a case-control study instead of a family-based design in order to gain power. For a case-control design, we compare the chi-square statistic from a structured population (a union of two subpopulations) with that from a homogeneous population with the same prevalence and allele frequencies. We provide an explicit formula for calculating the chi-square statistics from 17 parameters, such as the proportions of the subpopulations and the allele frequencies in the subpopulations. We choose these factors because they have the potential to cause false associations. Each parameter takes a random value in a chosen range. We then calculate the likelihood of reaching opposite conclusions in the structured and the homogeneous populations. This is the likelihood of a false positive caused by population stratification. The advantage of this method is that it provides a cost-effective way to choose between case-control data and family data before any data are collected. We conclude that sample size has a significant effect on the likelihood of a false positive caused by population stratification. The larger the sample size, the more likely a false positive is if the population structure is ignored. If budget constraints limit the sample size to fewer than 200, then a case-control study may be the better choice because of its power.
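The comparison described above can be illustrated with a small Python sketch. It is not the 17-parameter formula of the paper, which is not reproduced here. It assumes a simplified null case with seven hypothetical parameters (sample sizes, subpopulation proportion `w1`, allele frequencies `p1` and `p2`, prevalences `k1` and `k2`) and no true association within either subpopulation, so that a homogeneous population with the same pooled frequencies would give a chi-square of zero. The parameter ranges and the 3.841 threshold (5% level, 1 degree of freedom) are likewise assumptions chosen for illustration.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def stratified_chi_square(n_cases, n_controls, w1, p1, p2, k1, k2):
    """Chi-square on EXPECTED allele counts when two subpopulations are pooled.

    Assumption: the allele is unrelated to disease inside each subpopulation,
    so any nonzero value is produced by the population structure alone.
    """
    w2 = 1.0 - w1
    # Share of subpopulation 1 among cases and among controls.
    case_w1 = w1 * k1 / (w1 * k1 + w2 * k2)
    ctrl_w1 = w1 * (1 - k1) / (w1 * (1 - k1) + w2 * (1 - k2))
    # Allele frequency seen in pooled cases and pooled controls.
    p_case = case_w1 * p1 + (1 - case_w1) * p2
    p_ctrl = ctrl_w1 * p1 + (1 - ctrl_w1) * p2
    # Each individual carries two alleles.
    a, b = 2 * n_cases * p_case, 2 * n_cases * (1 - p_case)
    c, d = 2 * n_controls * p_ctrl, 2 * n_controls * (1 - p_ctrl)
    return chi_square_2x2(a, b, c, d)


def false_positive_rate(n_per_group, trials=2000, seed=0, threshold=3.841):
    """Fraction of random parameter draws whose chi-square exceeds threshold.

    The sampling ranges below are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w1 = rng.uniform(0.1, 0.9)
        p1, p2 = rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)
        k1, k2 = rng.uniform(0.01, 0.2), rng.uniform(0.01, 0.2)
        chi2 = stratified_chi_square(n_per_group, n_per_group, w1, p1, p2, k1, k2)
        if chi2 > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (100, 200, 1000, 5000):
        print(n, false_positive_rate(n))
```

In this sketch the expected chi-square grows in proportion to the sample size for fixed parameters, so the estimated rate cannot decrease as `n_per_group` increases with the same seed. This is consistent with the conclusion stated above about sample size.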

Highlights

  • Since the Human Genome Project, studies of genetic variation in human populations have expanded extensively [5] [6]

  • We conclude that sample size has a significant effect on the likelihood of a false positive caused by population stratification

  • We provide a formula for calculating the likelihood of a false positive caused by population stratification, given the ranges of the parameters


Summary

Introduction

Since the Human Genome Project, studies of genetic variation in human populations have expanded extensively [5] [6]. Population stratification is always a serious concern in association analysis [7] [8]. Thomas and Witte [1] gave a good summary of the problem. To avoid it, many family-based methods have been proposed, including the transmission disequilibrium test (TDT; Spielman et al. [10]) and its extensions. Shin and Lee [13] proposed a mixed model to reduce spurious genetic associations produced by population stratification in genome-wide association studies. Some studies have explored associations between common SNPs and a social deprivation measure of socio-economic status, and these have to deal with structured population data [17].

