Abstract

Nearly one thousand human genome-wide association studies (GWAS) have examined over 210 diseases and traits and found over 1,200 SNP associations. With improved genotyping technologies and the growing number of available markers, case-control GWAS have become a key tool for investigating complex diseases. This study assesses the influence of genotype and diagnosis errors in GWAS by analyzing a synthetic gene dataset that incorporates factors known to influence association measurement. Monte Carlo methods were used to generate the synthetic gene data, which incorporated gene inheritance, relative risk levels, disease penetrance, genotype distribution, and sample size, as well as the two error factors that are the focus of this study. The resulting dataset provides a truth set for assessing statistical method performance and association sensitivity. Although the relationship between genotype and diagnosis error measures and statistical power loss was previously understood, these results quantify and document its extent. Our results also demonstrate that for low-risk non-recessive loci, sample sizes in the range of 1,000 to 2,000 cases will achieve 80% power thresholds for type I error levels of 10^-8, even with realistic genotype and phenotype error assumptions. Nevertheless, the sample size increase needed to compensate for power loss due to genotype and diagnosis errors should not be underestimated. Our estimates indicate that sample size increase requirements are in the range of 20% to 40%, depending on the gene inheritance model assumed.
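To make the simulation idea concrete, the following is a minimal Python sketch of a Monte Carlo power estimate for a single locus. It is not the authors' procedure. Everything specific in it is an assumption chosen for illustration: Hardy-Weinberg genotype proportions, a symmetric diagnosis error (the same fraction of each labeled group truly belongs to the other group), a genotype error that miscalls uniformly to either of the other two genotypes, the Cochran-Armitage trend test with additive scores as the association test, and all function names and parameter values.

```python
import math
import random

WEIGHTS = (0, 1, 2)  # additive scores for 0, 1, 2 copies of the risk allele


def normalize(xs):
    total = sum(xs)
    return [x / total for x in xs]


def group_distributions(p, baseline, rr, model, geno_err=0.0, diag_err=0.0):
    """Observed genotype distributions for labeled cases and labeled controls."""
    q = 1.0 - p
    g = [q * q, 2 * p * q, p * p]  # Hardy-Weinberg proportions (assumed)
    copies = {"dominant": (0, 1, 1), "recessive": (0, 0, 1),
              "multiplicative": (0, 1, 2)}[model]
    f = [baseline * rr ** k for k in copies]  # penetrance per genotype
    case = normalize([gi * fi for gi, fi in zip(g, f)])
    ctrl = normalize([gi * (1 - fi) for gi, fi in zip(g, f)])
    # Diagnosis error (assumed symmetric): a fraction diag_err of each
    # labeled group truly belongs to the other group.
    lab_case = [(1 - diag_err) * a + diag_err * b for a, b in zip(case, ctrl)]
    lab_ctrl = [(1 - diag_err) * b + diag_err * a for a, b in zip(case, ctrl)]

    # Genotype error (assumed uniform): a call is wrong with probability
    # geno_err and is then equally likely to be either other genotype.
    def miscall(dist):
        return [(1 - geno_err) * x + 0.5 * geno_err * (1 - x) for x in dist]

    return miscall(lab_case), miscall(lab_ctrl)


def trend_pvalue(r, s):
    """Cochran-Armitage trend test p-value (1 df) for case counts r, control counts s."""
    R, S = sum(r), sum(s)
    N = R + S
    n = [a + b for a, b in zip(r, s)]
    A = sum(w * a for w, a in zip(WEIGHTS, r))
    W = sum(w * c for w, c in zip(WEIGHTS, n))
    denom = R * S * (N * sum(w * w * c for w, c in zip(WEIGHTS, n)) - W * W)
    if denom == 0:
        return 1.0  # no genotype variation, nothing to test
    chi2 = N * (N * A - R * W) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square tail with 1 df


def sample_counts(rng, dist, n):
    counts = [0, 0, 0]
    for genotype in rng.choices((0, 1, 2), weights=dist, k=n):
        counts[genotype] += 1
    return counts


def estimate_power(n_cases, n_controls, case_dist, ctrl_dist, alpha, reps, seed=0):
    """Fraction of simulated studies whose trend test rejects at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        r = sample_counts(rng, case_dist, n_cases)
        s = sample_counts(rng, ctrl_dist, n_controls)
        hits += trend_pvalue(r, s) < alpha
    return hits / reps


if __name__ == "__main__":
    # Hypothetical settings: allele frequency 0.3, baseline penetrance 0.05,
    # genotype relative risk 1.3, 1% genotype error, 5% diagnosis error.
    clean = group_distributions(0.3, 0.05, 1.3, "multiplicative")
    noisy = group_distributions(0.3, 0.05, 1.3, "multiplicative",
                                geno_err=0.01, diag_err=0.05)
    for label, (case, ctrl) in (("no errors", clean), ("with errors", noisy)):
        print(label, estimate_power(1500, 1500, case, ctrl, alpha=1e-8, reps=200))
```

Under these assumptions both error types pull the observed case and control genotype distributions toward each other, which is the mechanism behind the power loss. Rerunning `estimate_power` with larger `n_cases` and `n_controls` until the error-free power is recovered gives a rough feel for the sample size inflation involved.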

Highlights

  • Over 900 human genome-wide association studies (GWAS) have examined over 210 diseases and traits and found over 1,200 single nucleotide polymorphism (SNP) associations [1]

  • With improved genotyping technologies and the growing number of available markers, case-control GWAS have become a key tool for investigating complex diseases

  • Gordon et al [6] analyzed the influence of both random phenotype and genotype misclassification errors on statistical power, contrasting the Cochran-Armitage trend test (CA-A) with the 2 df genotype test, and concluded that the CA-A is more powerful
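The two tests contrasted in the last highlight can be sketched in a few lines of Python. This is only an illustration, not code from Gordon et al [6]: the additive scores (0, 1, 2), the function names, and the example counts are all assumptions. The trend test uses one degree of freedom and the genotype test is the Pearson chi-square on the 2 x 3 case-control by genotype table with two degrees of freedom.

```python
import math


def trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for genotype counts."""
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]
    A = sum(w * r for w, r in zip(weights, cases))
    W = sum(w * c for w, c in zip(weights, n))
    denom = R * S * (N * sum(w * w * c for w, c in zip(weights, n)) - W * W)
    return N * (N * A - R * W) ** 2 / denom if denom else 0.0


def genotype_chi2(cases, controls):
    """Pearson chi-square (2 df) on the 2 x 3 genotype table."""
    R, S = sum(cases), sum(controls)
    N = R + S
    chi2 = 0.0
    for r, s in zip(cases, controls):
        n = r + s
        if n == 0:
            continue  # an empty genotype column contributes nothing
        for observed, row_total in ((r, R), (s, S)):
            expected = row_total * n / N
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value(chi2, df):
    """Upper-tail chi-square probability, closed forms for 1 and 2 df only."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or 2 supported in this sketch")


if __name__ == "__main__":
    # Hypothetical counts for genotypes with 0, 1, 2 risk alleles.
    cases, controls = (300, 500, 200), (400, 470, 130)
    print("trend    p =", p_value(trend_chi2(cases, controls), 1))
    print("genotype p =", p_value(genotype_chi2(cases, controls), 2))
```

When the genotype effect is roughly additive, as in these made-up counts, the two statistics are nearly equal, so the trend test gives the smaller p-value because it spends one degree of freedom instead of two. For strongly recessive patterns the additive scores fit poorly and the comparison can go the other way.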


Summary

Introduction

Over 900 human genome-wide association studies (GWAS) have examined over 210 diseases and traits and found over 1,200 SNP associations [1]. Edwards et al [5] quantified the effect of phenotypic error on power and sample size calculations for case-control genetic association studies between a marker locus and a disease phenotype.

