Abstract

Motivation: Genome-wide association studies (GWAS) have identified many loci implicated in disease susceptibility. Integration of GWAS summary statistics (P-values) and functional genomic datasets should help to elucidate mechanisms.

Results: We extended a non-parametric SNP set enrichment method, which tests for enrichment of GWAS signals in functionally defined loci, to the situation where only GWAS P-values are available. The approach is implemented in VSEAMS, a freely available software pipeline. We use VSEAMS to identify enrichment of type 1 diabetes (T1D) GWAS associations near genes that are targets for the transcription factors IKZF3, BATF and ESRRA. IKZF3 lies in a known T1D susceptibility region, while BATF and ESRRA overlap other immune disease susceptibility regions, validating our approach and suggesting novel avenues of research for T1D.

Availability and implementation: VSEAMS is available for download (http://github.com/ollyburren/vseams).

Contact: chris.wallace@cimr.cam.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • Genome-wide association studies have been successful in identifying loci associated with many phenotypes (Welter et al., 2014), and summary statistics, in the form of a single-SNP P-value for each marker tested, are increasingly available in the public domain (Burren et al., 2011; Okada et al., 2014).

  • VSEAMS allows for stratified analysis of multiple genome-wide association studies (GWAS), for example the individual components of a meta-analysis, using van Elteren’s method to calculate a combined Z-score. We also show that summary statistics from a meta-analysis of multiple GWAS can be used directly.

  • We found that Z-scores calculated by our approximate method showed a close fit to their theoretical distribution. Taken together, these results indicate that VSEAMS is a suitable replacement where raw genotyping data are not available, and that it is applicable to a meta-analysis that may include both imputation and different genotyping platforms.

Summary

Introduction

Genome-wide association studies have been successful in identifying loci associated with many phenotypes (Welter et al., 2014), and summary statistics, in the form of a single-SNP P-value for each marker tested, are increasingly available in the public domain (Burren et al., 2011; Okada et al., 2014). Fifty susceptibility loci are currently described for type 1 diabetes (http://immunobase.org, accessed 15/03/2014), but in only 12 of these regions is the index SNP itself a non-synonymous coding SNP, or in strong linkage disequilibrium (LD) with one. This finding agrees with previous research (Schaub et al., 2012; lari et al., 2012) and indicates a central role for gene regulation. One approach to integrating GWAS results with functional annotation is to modify non-parametric methods developed for microarray pathway analysis (Subramanian et al., 2005) for use with GWAS datasets (Wang et al., 2007). These approaches assign SNPs to genes based on public annotations and test for differences in evidence of association between two sets of genes, correcting for inter-SNP correlation due to LD. The permutation-based approaches usually employed to adjust for this correlation are computationally expensive.
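The two-set comparison described above can be illustrated with a Wilcoxon rank-sum statistic computed on P-values alone. The following is a minimal sketch, not the VSEAMS implementation: the function name `wilcoxon_z` and the toy P-values are hypothetical, and the variance assumes independent SNPs with no ties, so it omits the LD correction that the approaches above apply.

```python
import math


def wilcoxon_z(test_pvals, control_pvals):
    """Z-score of the Wilcoxon rank-sum statistic for the test set.

    Assumption: SNPs are independent (no LD correction) and the
    variance is not adjusted for ties. A negative Z means the test
    set tends to have smaller P-values than the control set.
    """
    n, m = len(test_pvals), len(control_pvals)
    total = n + m
    # Pool both sets, labelling test SNPs 1 and control SNPs 0.
    pooled = sorted([(p, 1) for p in test_pvals] +
                    [(p, 0) for p in control_pvals])

    rank_sum = 0.0
    i = 0
    while i < total:
        # Find the block of tied P-values starting at position i.
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2.0
        rank_sum += midrank * sum(label for _, label in pooled[i:j])
        i = j

    # Null mean and variance of the rank sum under independence.
    mean = n * (total + 1) / 2.0
    var = n * m * (total + 1) / 12.0
    return (rank_sum - mean) / math.sqrt(var)


# Toy data (hypothetical): test-set SNPs carry the smallest P-values.
z = wilcoxon_z([0.001, 0.002, 0.003], [0.5, 0.6, 0.7, 0.8])
print(round(z, 4))
```

Because correlated SNPs inflate the true variance of the rank sum, a Z-score computed this way would be anticonservative on real data; the sketch only shows the shape of the test that an LD adjustment then corrects.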

