Abstract

Determining the most promising single-nucleotide polymorphisms (SNPs) is a challenge in genome-wide association studies (GWAS), in which hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and GWAS SNP arrays include SNPs with a wide distribution of MAFs. Therefore, it is critical to understand the effect of MAF on the false-positive rate.

The Framingham Heart Study simulated data set (Problem 3, with answers) was used to examine the effects of varying MAFs on the likelihood of false positives. Replication set 1 was used to generate 1 million permutations of case/control status in unrelated individuals. Logistic regression was used to test for association between each SNP and myocardial infarction under an additive model. We report the number of "significant" tests by MAF at α = 10⁻⁴, 10⁻⁵, and 10⁻⁶.

Common SNPs exhibited fewer false positives than expected. At α = 10⁻⁴, SNPs with MAF of 25% and 50% resulted in 69.2 [95% CI: 62.8-75.6] and 70.8 [95% CI: 61.3-80.4] false positives, respectively, compared to 100 expected. Rare SNPs exhibited more variability but did not show more false-positive results than expected by chance. However, at α = 10⁻⁴, SNPs with MAF = 5% exhibited significantly more false positives (105.5 [95% CI: 81-130.1]) than SNPs with MAF = 25% or 50%. Similar results were seen at the other α values.

These results suggest that removing low-MAF SNPs from analysis because of concerns about inflated false-positive results may not be appropriate.
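The comparison against "100 expected" can be reproduced with a little arithmetic. The sketch below assumes one million null tests per MAF class, which is inferred from the expected count of 100 at α = 10⁻⁴ and is not stated in those exact terms in the abstract. The function names are hypothetical, and the observed counts and confidence intervals are copied from the abstract.

```python
# Illustrative check of the abstract's numbers. N_NULL_TESTS is an assumption
# inferred from "100 expected" at alpha = 1e-4; it is not taken from the paper's code.
N_NULL_TESTS = 1_000_000


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of null tests with p < alpha when p-values are uniform."""
    return n_tests * alpha


def compare_to_expected(lower: float, upper: float, expected: float) -> str:
    """Classify a 95% CI for the observed count relative to the expected count."""
    if upper < expected:
        return "fewer than expected"
    if lower > expected:
        return "more than expected"
    return "consistent with expected"


# (observed count, CI lower bound, CI upper bound) at alpha = 1e-4, from the abstract.
observed = {
    "MAF 5%": (105.5, 81.0, 130.1),
    "MAF 25%": (69.2, 62.8, 75.6),
    "MAF 50%": (70.8, 61.3, 80.4),
}

expected = expected_false_positives(N_NULL_TESTS, 1e-4)
for label, (count, lower, upper) in observed.items():
    verdict = compare_to_expected(lower, upper, expected)
    print(f"{label}: {count} observed vs {expected:.0f} expected -> {verdict}")
```

Under this reading, the intervals for the 25% and 50% MAF classes lie entirely below 100, while the interval for the 5% class contains 100, which matches the abstract's description.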

Highlights

  • Correct identification of the most promising single-nucleotide polymorphisms (SNPs) for follow-up is one of the greatest challenges of conducting genome-wide association studies (GWAS)

  • While low minor allele frequency (MAF) SNPs (1% or 5%) did not have elevated false positives compared to chance, the 5% MAF SNPs had significantly more false positives than the more common (25% and 50%) MAF SNPs at the 10⁻⁴ (p < 0.03) and 10⁻⁵ (p < 0.05) thresholds

  • The current study provides evidence that false-positive rate is influenced by MAF


Summary

Introduction

Correct identification of the most promising single-nucleotide polymorphisms (SNPs) for follow-up is one of the greatest challenges of conducting genome-wide association studies (GWAS). The conservative Bonferroni correction lowers the critical significance threshold α to account for the number of tests conducted; any SNP with a p-value below the adjusted threshold is considered significant. False-discovery rate (FDR) procedures explicitly rank p-values before selecting the SNPs below the FDR threshold for further consideration. Genome-wide permutation testing methods randomly permute case and control status and examine the distribution of the best resulting test statistic, creating an empirical distribution of extreme test statistics against which observed results are compared. With each of these methods, if multiple SNPs reach the threshold, the investigator must choose which SNPs to follow up for replication.
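The three thresholding approaches described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the FDR procedure shown is the Benjamini-Hochberg step-up rule (the text does not say which FDR procedure is meant), the permutation function assumes the maximum test statistic from each permuted data set has already been computed, and all function names and example numbers are made up for illustration.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values: Sequence[float], q: float) -> List[int]:
    """Indices of tests rejected by the Benjamini-Hochberg step-up rule at FDR level q."""
    m = len(p_values)
    # Rank tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its rank-specific bound.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


def empirical_p_value(observed_stat: float, permuted_max_stats: Sequence[float]) -> float:
    """Genome-wide empirical p-value from the maxima of permuted data sets.

    The +1 terms count the observed data as one member of the null distribution,
    so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in permuted_max_stats if s >= observed_stat)
    return (exceed + 1) / (len(permuted_max_stats) + 1)


# Made-up p-values for eight hypothetical SNPs.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]

threshold = bonferroni_threshold(0.05, len(p_values))
bonferroni_hits = [i for i, p in enumerate(p_values) if p < threshold]
bh_hits = benjamini_hochberg(p_values, q=0.05)

# Made-up maximum statistics from nine hypothetical permutations.
perm_p = empirical_p_value(4.2, [1.1, 2.5, 3.0, 4.5, 2.2, 1.9, 3.8, 2.7, 3.3])

print("Bonferroni:", bonferroni_hits)
print("Benjamini-Hochberg:", bh_hits)
print("Empirical p-value:", perm_p)
```

In this toy example Bonferroni retains only the first SNP while the step-up rule retains the first two, which illustrates why the choice of method changes how many SNPs an investigator is left to prioritize.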

