Abstract

Genome-wide association studies (GWAS) are a promising approach for identifying common genetic variants associated with disease on the basis of millions of single nucleotide polymorphisms (SNPs). Single-locus association analysis loses power because of the heavy correction required for multiple comparisons. To avoid this, several methods group SNPs into a SNP set based on genomic features and then test the joint effect of the set. We compare the performance of four such methods: principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), and sliced inverse regression (SIR). Simulated SNP sets are generated under models with 0, 1, and ≥2 causal SNPs. Our simulation results show that all of these methods control the type I error at the nominal significance level. SPCA is consistently more powerful than the other methods across the linkage disequilibrium structures and minor allele frequencies of the simulated datasets. We also apply the four methods to a real GWAS of non-small cell lung cancer (NSCLC) in a Han Chinese population.

Highlights

  • It is widely believed that genetic variants play an important role in the etiology of common diseases and quantitative traits

  • We compare the performances of principal component analysis (PCA), kernel principal component analysis (KPCA), supervised principal component analysis (SPCA) and sliced inverse regression (SIR) using simulated datasets

  • Results from the simulations on scenarios A4–A6 are presented in Figure 1, which shows that SPCA has the best power

Introduction

It is widely believed that genetic variants play an important role in the etiology of common diseases and quantitative traits. In the last few years, GWAS have become a popular approach for identifying genetic variation related to complex diseases [1,2]. Single-locus association tests (SLAT) are usually run to identify causal or associated SNPs. Such a SNP-by-SNP association study requires a multiple-testing adjustment procedure, such as the Bonferroni correction or false discovery rate control, to keep the overall type I error rate at an appropriate level. Studies suggest that complex diseases are often caused by weak-effect SNPs (relative risk, RR = 1.5), which results in poor statistical power after multiple-testing correction [4,5].
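To make the cost of SNP-by-SNP correction concrete, the following is a minimal sketch of the two adjustment procedures named above. The function names and the toy p-values are illustrative assumptions, not taken from the study, and the false discovery rate procedure shown is the Benjamini-Hochberg step-up rule, which is one common choice rather than necessarily the one the authors had in mind.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q.

    Benjamini-Hochberg step-up rule: find the largest rank k such that the
    k-th smallest p-value is <= (k / m) * q, then reject ranks 1..k.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten SNPs (illustration only).
    toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    cutoff = bonferroni_threshold(0.05, len(toy_p))
    print("Bonferroni rejections:", sum(p <= cutoff for p in toy_p))
    print("BH rejections:", sum(benjamini_hochberg(toy_p, q=0.05)))
    # Threshold for a GWAS-scale analysis of one million SNPs.
    print("Per-SNP threshold for 1e6 tests:", bonferroni_threshold(0.05, 1_000_000))
```

With one million SNPs and an overall level of 0.05, the Bonferroni per-SNP threshold is 5 × 10^-8, which a weak-effect SNP rarely reaches at realistic sample sizes. Testing a SNP set as a unit reduces the number of tests and so relaxes this threshold.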
