How to link call rate and p‐values for Hardy–Weinberg equilibrium as measures of genome‐wide SNP data quality

Helmut Finner, Klaus Strassburger, Thorsten Dickhaus, H.-Erich Wichmann, Thomas Illig, Iris M Heid, Christian Herder, Peter Lichtner, Thomas Meitinger, Christian Gieger, Guido Giani, Wolfgang Rathmann

doi:10.1002/sim.4004

Abstract

We study the link between two quality measures of SNP (single nucleotide polymorphism) data in genome-wide association (GWA) studies, namely per-SNP call rates (CR) and p-values for testing Hardy-Weinberg equilibrium (HWE). The aim is to improve these measures by applying methods based on realized randomized p-values, the false discovery rate and estimates for the proportion of false hypotheses. Exact non-randomized conditional p-values for testing HWE cannot be recommended for estimating the proportion of false hypotheses; their realized randomized counterparts should be used instead. P-values corresponding to the asymptotic unconditional chi-square test lead to reasonable estimates only if SNPs with low minor allele frequency are excluded. We provide an algorithm to compute the probability that SNPs violate HWE given the observed CR, which yields an improved measure of data quality. The proposed methods are applied to SNP data from the KORA (Cooperative Health Research in the Region of Augsburg, Southern Germany) 500 K project, a GWA study in a population-based sample genotyped by Affymetrix GeneChip 500 K arrays using the calling algorithm BRLMM 1.4.0. We show that all SNPs with CR = 100 per cent are nearly in perfect HWE, which suggests that the population meets the conditions required for HWE, at least for these SNPs. Moreover, we show that the proportion of SNPs not in HWE increases with decreasing CR. We conclude that using a single threshold for judging HWE p-values without taking the CR into account is problematic. Instead, we recommend a stratified analysis with respect to CR.
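The stratified analysis recommended in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes the asymptotic chi-square test with one degree of freedom for HWE, a simple Storey-type estimator for the proportion of false hypotheses (the abstract does not say which estimator the paper uses), and hypothetical CR cut-offs of 98 and 100 per cent. All function names and the tuning parameter `lam` are illustrative.

```python
import math
from collections import defaultdict


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Asymptotic chi-square (1 df) p-value for HWE from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test carries no information
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(stat / 2))


def estimate_pi1(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of false null hypotheses.

    Assumption: this estimator stands in for whatever the paper uses.
    """
    pi0 = sum(p > lam for p in pvalues) / ((1 - lam) * len(pvalues))
    return 1 - min(1.0, pi0)


def call_rate_stratum(cr):
    """Assign a SNP to a call-rate stratum (cut-offs are hypothetical)."""
    if cr >= 1.0:
        return "CR = 100%"
    if cr >= 0.98:
        return "98% <= CR < 100%"
    return "CR < 98%"


def pi1_by_stratum(snps):
    """snps: iterable of (call_rate, n_aa, n_ab, n_bb) tuples."""
    strata = defaultdict(list)
    for cr, n_aa, n_ab, n_bb in snps:
        strata[call_rate_stratum(cr)].append(hwe_chisq_pvalue(n_aa, n_ab, n_bb))
    return {name: estimate_pi1(pvals) for name, pvals in strata.items()}
```

Calling `pi1_by_stratum` on a list of per-SNP call rates and genotype counts returns one estimated proportion of HWE-violating SNPs per stratum, which can then be compared across strata rather than judged against a single global p-value threshold. As the abstract notes, chi-square p-values give reasonable estimates only after excluding SNPs with low minor allele frequency, so such SNPs would need to be filtered out first.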

Journal: Statistics in Medicine
Publication Date: Jul 19, 2010
Citations: 14

Similar Papers

Short communication: Relationship of call rate and accuracy of single nucleotide polymorphism genotypes in dairy cattle
T.A Cooper ... P.M Vanraden
Journal of Dairy Science | VOL. 96
14 Mar 2013

Exploration of the Genetic Basis of GVHD by Genetic Association Studies
Seishi Ogawa ... Sasazuki Takehiko
Biology of Blood and Marrow Transplantation | VOL. 15
01 Jan 2009

Polymorphisms in Interleukin-15 Gene on Chromosome 4q31.2 Are Associated with Psoriasis Vulgaris in Chinese Population
Xue-Jun Zhang ... Jian-Jun Liu
Journal of Investigative Dermatology | VOL. 127
01 Nov 2007

Concordance rate in cattle and sheep between genotypes differing in Illumina GenCall quality score.
D P Berry ... A C O'Brien
Animal genetics | VOL. 52
01 Feb 2021
