Abstract

Background: Genome-wide association studies (GWAS) have generated a wealth of valuable genotyping data for complex diseases/traits. A large proportion of these data contain many weakly associated markers that have been missed in traditional single-marker analyses but that may provide valuable insights into the genetic components of diseases. Gene set analysis (GSA) augmented by protein-protein interaction network data provides a promising way to examine GWAS data by analyzing the combined effects of multiple genes/markers, each of which may individually have only a weak to moderate association effect. A critical issue in GSA of GWAS data is the definition of gene-wise P values based on multiple SNPs mapped to a gene.

Results: In this study, we proposed an alternative restricted search approach based on our previously developed dense module search algorithm, and we demonstrated it in the CATIE GWAS dataset for schizophrenia. Specifically, we explored three ways of computing gene-wise P values and examined their effects on the resultant module genes. These methods calculate gene-wise P values based on all the SNPs, the top-ranked SNPs, or the most significant SNP among all the SNPs mapped to a gene. We applied the restricted search approach and identified a module gene set for each of the gene-wise P value data sets. In our evaluation using an independent method, ALIGATOR, we showed that although each of these input datasets generated a unique set of module genes, all of them were significant in the GWAS dataset. Further functional enrichment analysis of these module genes showed that, at the pathway level, they were all consistently related to neuro- and immune-related pathways. Finally, we compared our method with a previously reported method.

Conclusion: Our results showed that the approach used to compute gene-wise P values in GWAS data is critical in GSA. This work is useful for evaluating key factors in GSA of GWAS data.

Highlights

  • The restricted search strategy could greatly reduce the computational burden. We demonstrated this method in a GWAS dataset for schizophrenia and explored three different ways to define gene-wise P values

  • We explored three options in VEGAS to compute gene-wise P values based on sets of single nucleotide polymorphisms (SNPs): (1) using all the SNPs mapped to a gene; (2) using the top 10% of SNPs ranked by SNP-level P value (“VEGAS-top”); and (3) using the most significant SNP, i.e., the SNP with the smallest P value (“minP”)
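
The three options listed above can be illustrated with a short Python sketch. It is not the VEGAS implementation. It assumes two-sided SNP P values, converts them to 1-df chi-square statistics, and estimates a gene-wise P value by Monte Carlo simulation from a multivariate normal distribution with a user-supplied LD correlation matrix. The function names (`gene_statistic`, `gene_p_value`), the rule that rounds the top 10% up to at least one SNP, the simulation count, and the simulation-based treatment of minP are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def p_to_chi2(p):
    """Convert a two-sided P value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile; the sign is irrelevant
    return z * z

def gene_statistic(chi2_values, mode):
    """Sum the k largest chi-square statistics for one gene.
    mode "all": every SNP; "top": top 10% (at least one); "minP": best SNP only."""
    ranked = sorted(chi2_values, reverse=True)
    n = len(ranked)
    if mode == "all":
        k = n
    elif mode == "top":
        k = max(1, (n + 9) // 10)  # integer ceiling of n / 10 (assumed rule)
    elif mode == "minP":
        k = 1
    else:
        raise ValueError("mode must be 'all', 'top', or 'minP'")
    return sum(ranked[:k])

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def gene_p_value(p_values, ld, mode="all", n_sim=2000, seed=0):
    """Empirical gene-wise P value under an LD-aware null (illustrative only)."""
    rng = random.Random(seed)
    lower = cholesky(ld)
    n = len(p_values)
    observed = gene_statistic([p_to_chi2(p) for p in p_values], mode)
    hits = 0
    for _ in range(n_sim):
        indep = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the independent draws so they follow the LD structure.
        z = [sum(lower[i][k] * indep[k] for k in range(i + 1)) for i in range(n)]
        if gene_statistic([v * v for v in z], mode) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)  # add-one correction avoids a P value of 0

if __name__ == "__main__":
    snp_p = [0.001, 0.04, 0.20, 0.65]  # hypothetical SNP-level P values
    ld = [[1.0, 0.6, 0.1, 0.0],
          [0.6, 1.0, 0.2, 0.0],
          [0.1, 0.2, 1.0, 0.3],
          [0.0, 0.0, 0.3, 1.0]]        # hypothetical LD correlation matrix
    for mode in ("all", "top", "minP"):
        print(mode, gene_p_value(snp_p, ld, mode=mode))
```

Because the same `gene_statistic` function is applied to both the observed and the simulated statistics, each option is compared against its own null distribution, which is why the three options can give different gene-wise P values for the same gene.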

Summary

Introduction

Genome-wide association studies (GWAS) have generated a wealth of valuable genotyping data for complex diseases/traits. Complex diseases are likely caused by multiple genes and markers, each of which may contribute only a weak to moderate effect. Gene set analysis (GSA) of GWAS data provides an alternative approach for assessing the joint effects of multiple genes [2], regardless of whether they are individually significant. Given that these markers are biologically or functionally correlated, GSA would increase the power to detect them in a typical GWAS dataset [2]. GSA augmented by protein-protein interaction (PPI) network data provides a promising way to examine GWAS data by analyzing these combined effects. More comprehensive methods are needed to integrate GWAS data with PPI data to help construct, prioritize, and evaluate subnetworks for complex diseases.

