Abstract

The last decade has seen rapid improvements in high-throughput single nucleotide polymorphism (SNP) genotyping technologies that have consequently made genome-wide association studies (GWAS) possible. With tens to hundreds of thousands of SNP markers being tested simultaneously in GWAS, it is imperative to appropriately pre-process, or filter out, those SNPs that may lead to false associations. This paper explores the relationships between various SNP genotype and phenotype attributes and their effects on false associations. We show that (i) uniformly distributed ordinal data as well as binary data are more easily influenced, though not necessarily negatively, by differences in various SNP attributes compared with normally distributed data; (ii) filtering SNPs on minor allele frequency (MAF) and extent of Hardy–Weinberg equilibrium (HWE) deviation has little effect on the overall false positive rate; (iii) in some cases, filtering on MAF only serves to exclude SNPs from the analysis without reduction of the overall proportion of false associations; and (iv) HWE, MAF and heterozygosity are all dependent on minor genotype frequency, a newly proposed measure for genotype integrity.

Highlights

  • Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers have become increasingly popular for dissecting the genetics of complex traits

  • Minor genotype frequency (MGF) should be considered in addition to minor allele frequency (MAF) because a SNP with low MGF does not necessarily have low MAF (Fig. 1)

  • MGF should also be considered alongside the test of Hardy–Weinberg equilibrium (HWE) because a SNP with low MGF does not necessarily deviate from HWE, for example when the minor genotype is one of the homozygotes
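
The two points about MGF can be checked with a small numerical sketch. The genotype counts below are hypothetical (not taken from the paper), the function name `mgf_maf_hwe` is illustrative, and MGF is assumed here to be the frequency of the least common of the three genotypes at a biallelic SNP. The HWE check is a plain Pearson chi-square goodness-of-fit statistic with 1 degree of freedom, which may differ from the test used by the authors.

```python
def mgf_maf_hwe(n_AA, n_AB, n_BB):
    """Return (MGF, MAF, HWE chi-square) for one biallelic SNP.

    Assumes MGF is the frequency of the rarest genotype and that both
    alleles are present (otherwise an expected count would be zero).
    """
    n = n_AA + n_AB + n_BB

    # Minor genotype frequency: rarest of the three genotype classes.
    mgf = min(n_AA, n_AB, n_BB) / n

    # Allele frequencies from genotype counts (each individual carries 2 alleles).
    p = (2 * n_AA + n_AB) / (2 * n)
    q = 1 - p
    maf = min(p, q)

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_AB, n_BB)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return mgf, maf, chi2


# Hypothetical SNP: the rare genotype is a homozygote (BB).
mgf, maf, chi2 = mgf_maf_hwe(50, 45, 5)
print(f"MGF={mgf:.3f}  MAF={maf:.3f}  HWE chi-square={chi2:.2f}")
```

For these counts the MGF is 0.05 while the MAF is 0.275, and the chi-square statistic (about 1.65) is below 3.84, the 5% critical value for 1 degree of freedom. A SNP can therefore have a rare genotype class without having a rare allele and without departing detectably from HWE.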

Introduction

Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers have become increasingly popular for dissecting the genetics of complex traits (reviewed in Hirschhorn et al. 2002 and McCarthy et al. 2008). A filtering process, defined by a set of rules, is generally applied to remove markers from an analysis. These rules may be derived arbitrarily (Easton et al. 2007; Sladek et al. 2007) or empirically (The Wellcome Trust Case Control Consortium 2007), and are typically based on various measures or attributes calculated to reflect the markers' integrity and usefulness. These attributes may include genotyping call-rate, missing data, monomorphism, loss of heterozygosity (LOH), observed heterozygosity (Hobs), minor allele frequency (MAF), and extent of Hardy–Weinberg equilibrium (HWE) deviations. We propose minor genotype frequency (MGF) as a filtering criterion and explore its value as a quality control measure.
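
A rule-based filter of this kind might be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `SnpCounts` container, the function names, and all thresholds are hypothetical, MGF is assumed to be the frequency of the rarest genotype, and the HWE and LOH checks are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    n_AA: int
    n_AB: int
    n_BB: int
    n_missing: int = 0


def snp_attributes(c: SnpCounts) -> dict:
    """Compute per-SNP quality attributes from genotype counts."""
    called = c.n_AA + c.n_AB + c.n_BB
    if called == 0:
        raise ValueError("no called genotypes for this SNP")
    total = called + c.n_missing
    p = (2 * c.n_AA + c.n_AB) / (2 * called)  # frequency of allele A
    return {
        "call_rate": called / total,
        # Monomorphic: only one allele observed among called genotypes.
        "monomorphic": c.n_AB == 0 and (c.n_AA == 0 or c.n_BB == 0),
        "h_obs": c.n_AB / called,  # observed heterozygosity
        "maf": min(p, 1 - p),
        # Assumed definition: frequency of the rarest genotype class.
        "mgf": min(c.n_AA, c.n_AB, c.n_BB) / called,
    }


def passes_filters(a: dict, min_call_rate=0.95, min_maf=0.05, min_mgf=0.01) -> bool:
    """Apply a rule set; the default thresholds are arbitrary placeholders."""
    return (
        a["call_rate"] >= min_call_rate
        and not a["monomorphic"]
        and a["maf"] >= min_maf
        and a["mgf"] >= min_mgf
    )


# Hypothetical SNPs: the second has a missing genotype class (MGF = 0).
for counts in (SnpCounts(50, 45, 5), SnpCounts(90, 10, 0), SnpCounts(100, 0, 0)):
    attrs = snp_attributes(counts)
    print(counts, "->", "keep" if passes_filters(attrs) else "drop")
```

Keeping attribute calculation separate from the rule set makes it straightforward to vary one threshold at a time and observe which SNPs are excluded.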
