Abstract

Genome-wide association studies often emphasize the single-nucleotide polymorphisms with the smallest p-values and give less attention to those not ranked near the top. We suggest that gene pathways contain valuable information that can enable identification of additional associations. We used gene set information to identify disease-related pathways using three methods: gene set enrichment analysis (GSEA), empirical enrichment p-values, and Ingenuity pathway analysis (IPA). Association tests were performed for common single-nucleotide polymorphisms and aggregated rare variants with traits Q1 and Q4. The pathway methods were evaluated by type I error, power, and the ranking of the VEGF pathway, the gene set used in the simulation model. GSEA and IPA had high power for detecting the VEGF pathway for trait Q1 (91.2% and 93%, respectively). Both methods were conservative, with deflated type I error rates (0.0083 and 0.0072, respectively). The VEGF pathway ranked first or second in 123 of 200 replicates using IPA and among the top 5 in 114 of 200 replicates using GSEA. The empirical enrichment method had lower power and a higher type I error rate. Thus, pathway analysis approaches may be useful in identifying biological pathways that influence disease outcomes.

Highlights

  • Genome-wide association studies (GWAS) have had successes in identifying novel genes related to diseases

  • The data analyzed in this study were from a mini-exome scan that used real sequence data for 3,205 genes donated by the 1000 Genomes Project; the data were made available by Genetic Analysis Workshop 17 (GAW17) [6]

  • Rather than focusing on individual single-nucleotide polymorphisms (SNPs) or genes, we considered pathways containing gene sets that together may improve power to identify disease-related candidate genes or pathways

Summary

Introduction

Genome-wide association studies (GWAS) have successfully identified novel genes related to diseases. In these studies, the focus is often placed on the most significant single-nucleotide polymorphisms (SNPs), those that pass a stringent genome-wide significance threshold. The variability explained by genome-wide significant SNPs is often substantially less than the heritability estimated for the disease [1]. Variants that confer small disease risks are more likely to be missed among the hundreds of thousands of SNPs that are tested. Additional methods are therefore needed that exploit genetic information beyond single-SNP association testing of common variants. Rather than focusing on individual SNPs or genes, we consider gene sets that may improve power to identify disease-related candidate genes or pathways. The gene set enrichment analysis (GSEA) developed by Subramanian et al. [2] was one of the first approaches developed to identify gene sets that are associated with phenotypes.
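To make the gene set idea concrete, the following is a minimal sketch of a GSEA-style running-sum enrichment score, not the implementation used in this study. It assumes each gene already has a single association score (for example, a hypothetical -log10 p-value) and that gene set membership is given as a boolean mask. The function names `enrichment_score` and `permutation_p_value` are illustrative only. The null distribution here is built by shuffling gene set membership, which is a simplification; the original GSEA procedure permutes phenotype labels instead.

```python
import numpy as np


def enrichment_score(scores, in_set, p=1.0):
    """Weighted running-sum enrichment score for one gene set.

    scores : per-gene association statistics (assumed input, one value per gene)
    in_set : boolean mask, True for genes that belong to the gene set
    p      : weight exponent; p=0 gives the unweighted Kolmogorov-Smirnov-like form
    """
    scores = np.asarray(scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)

    order = np.argsort(-scores)            # strongest association first
    hits = in_set[order]
    weights = np.abs(scores[order]) ** p

    n_miss = hits.size - hits.sum()
    hit_total = weights[hits].sum()
    if hits.sum() == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must contain some, but not all, genes with nonzero weight")

    # Step up at genes in the set (proportional to their weight),
    # step down by a constant at genes outside the set.
    step = np.where(hits, weights / hit_total, -1.0 / n_miss)
    running = np.cumsum(step)

    # The score is the running sum's largest deviation from zero, sign kept.
    return running[np.argmax(np.abs(running))]


def permutation_p_value(scores, in_set, n_perm=1000, seed=0):
    """One-sided empirical p-value from shuffled gene set membership (simplification)."""
    rng = np.random.default_rng(seed)
    in_set = np.asarray(in_set, dtype=bool)
    observed = enrichment_score(scores, in_set)
    null = np.array(
        [enrichment_score(scores, rng.permutation(in_set)) for _ in range(n_perm)]
    )
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

In this sketch a gene set whose members cluster at the top of the ranked list yields a score near 1, and a set whose members cluster at the bottom yields a score near -1. Ranking pathways across simulation replicates, as done for the VEGF pathway, would amount to repeating this calculation for every candidate gene set and sorting by the resulting p-values.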

