Abstract

Analyzing sets of genes in genome-wide association studies is a relatively new approach that aims to capitalize on biological knowledge about the interactions of genes in biological pathways. This approach, called pathway analysis or gene set analysis, has not yet been applied to the analysis of rare variants. Applying pathway analysis to rare variants offers two competing approaches. In the first approach rare variant statistics are used to generate p-values for each gene (e.g., combined multivariate collapsing [CMC] or weighted-sum [WS]) and the gene-level p-values are combined using standard pathway analysis methods (e.g., gene set enrichment analysis or Fisher’s combined probability method). In the second approach, rare variant methods (e.g., CMC and WS) are applied directly to sets of single-nucleotide polymorphisms (SNPs) representing all SNPs within genes in a pathway. In this paper we use simulated phenotype and real next-generation sequencing data from Genetic Analysis Workshop 17 to analyze sets of rare variants using these two competing approaches. The initial results suggest substantial differences in the methods, with Fisher’s combined probability method and the direct application of the WS method yielding the best power. Evidence suggests that the WS method works well in most situations, although Fisher’s method was more likely to be optimal when the number of causal SNPs in the set was low but the risk of the causal SNPs was high.
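As an illustration of the first approach, the sketch below combines gene-level p-values into a single pathway p-value with Fisher's combined probability method: the statistic X = -2 Σ ln(p_i) is referred to a chi-square distribution with 2k degrees of freedom, where k is the number of genes. This is a minimal sketch, not the authors' implementation. It assumes the gene-level p-values are independent, and the function name and example p-values are hypothetical.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's combined probability method.

    Hypothetical helper for illustration; assumes independent gene-level p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # half_stat is X / 2, where X = -2 * sum(ln p_i) is Fisher's statistic.
    half_stat = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square upper tail
    # has a closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical gene-level p-values for the genes in one pathway.
gene_p_values = [0.05, 0.05]
pathway_p = fisher_combined_pvalue(gene_p_values)
```

With a single gene the combined p-value equals the input p-value, which is a quick sanity check on the closed-form tail.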

Highlights

  • Analysis of single-nucleotide polymorphism (SNP) microarray data in genome-wide association studies has traditionally been agnostic because prior biological knowledge about the genome has not been taken into account

  • In pathway analysis, single-nucleotide polymorphisms (SNPs) are associated with genes, and genes are placed into sets

  • As noted, when spuriously associated genes are removed from sets, type I error rates are well controlled by all methods, suggesting that if spurious associations are better handled by rare variant methods, type I errors should be well controlled

Summary

Introduction

Analysis of single-nucleotide polymorphism (SNP) microarray data in genome-wide association studies has traditionally been agnostic because prior biological knowledge about the genome has not been taken into account. Madsen and Browning [6] suggested the possibility of combining rare variant information across a set (pathway) of genes, and this method has recently been applied to common variants [10], but the approach has not yet been implemented in practice on rare variant data. Nor has it been compared with traditional methods of pathway analysis, which first combine information within each gene into a gene-level statistic and then combine those statistics over the pathway.
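The difference between combining at the gene level and combining across the whole pathway can be sketched with a simple carrier-indicator collapse. This is only an illustrative simplification of the collapsing step, not the full CMC or WS procedure. The toy genotype matrix, gene names, and SNP-to-gene mapping are all hypothetical.

```python
def collapse_carriers(genotypes, snp_indices):
    """Return 1 for each individual carrying a minor allele at any listed SNP, else 0."""
    return [int(any(row[j] > 0 for j in snp_indices)) for row in genotypes]


# Toy data: rows are individuals, columns are SNPs, entries are minor allele counts.
genotypes = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
    [1, 0, 0, 0],
]

# Hypothetical mapping of genes in one pathway to SNP column indices.
gene_to_snps = {"GENE_A": [0, 1], "GENE_B": [2, 3]}

# Gene-level collapse: one indicator vector per gene, to be tested gene by gene
# and then combined over the pathway.
per_gene = {gene: collapse_carriers(genotypes, idx) for gene, idx in gene_to_snps.items()}

# Pathway-level collapse: pool every SNP in the pathway into a single set.
pathway_snps = sorted({j for idx in gene_to_snps.values() for j in idx})
pathway = collapse_carriers(genotypes, pathway_snps)
```

In the gene-level route, each vector in `per_gene` would feed a separate association test whose p-values are then combined. In the pathway-level route, the single `pathway` vector (or the pooled SNP set itself) is tested once.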
