Abstract

Comprehensive characterization of a gene's impact on phenotypes requires knowledge of the gene's context. To address this need, we introduce Candidate Genes and SNPs (CANGES), a systematic data integration method that links SNP and linkage disequilibrium data to pathway and protein-protein interaction information. It can be used as a knowledge discovery tool to search for disease-associated causative variants in genome-wide studies, as well as to generate new hypotheses on synergistically functioning genes. We demonstrate the utility of CANGES by integrating pathway and protein-protein interaction data to identify putative functional variants for (i) the p53 gene and (ii) three glioblastoma multiforme (GBM) associated risk genes. For the GBM case, we further integrate the CANGES results with clinical and genome-wide data for 209 GBM patients and identify genes that affect GBM patient survival. Our results show that selecting a focused set of genes can yield information beyond that of traditional genome-wide association approaches. Taken together, a holistic approach to identifying possible interacting genes and SNPs with CANGES provides a means to rapidly identify networks for any set of genes and to generate novel hypotheses. CANGES is available at http://csbi.ltdk.helsinki.fi/CANGES/

Introduction

Cellular functions are regulated by complex, multivariate molecular regulatory networks. Single nucleotide polymorphism (SNP) arrays have been powerful tools in genome-wide association (GWA) studies and have identified several genetic loci or genes that are associated with disease risk or poor prognosis [3,4,5]. Because such candidate genes typically affect cellular functions by altering signaling in regulatory networks, it is crucial to characterize these networks comprehensively. We introduce a data integration workflow, CANGES (Candidate Genes and SNPs), to rapidly identify focal genes, i.e., genes that code for proteins that interact with, or belong to the same molecular network as, a protein coded by a candidate gene. Central SNPs may affect protein function and cause gene-gene and SNP-SNP interactions in the regulatory network, leading to increased risk or survival effects.
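The definition of a focal gene given above can be illustrated with a minimal sketch. It is not the CANGES implementation: the function names, the data layout, and the toy identifiers (GENE_A, snp_1, pathway_1, and so on) are all hypothetical, and linkage disequilibrium data is left out for brevity. The sketch assumes that interactions arrive as gene pairs, that pathways arrive as sets of member genes, and that each SNP maps to a single gene.

```python
from collections import defaultdict


def focal_genes(candidate, ppi_edges, pathways):
    """Return genes that interact with `candidate` or share a pathway with it.

    ppi_edges: iterable of (gene, gene) pairs, treated as undirected.
    pathways:  dict mapping a pathway name to a set of member genes.
    """
    focal = set()
    # Direct protein-protein interaction partners of the candidate.
    for a, b in ppi_edges:
        if a == candidate:
            focal.add(b)
        elif b == candidate:
            focal.add(a)
    # All members of every pathway that contains the candidate.
    for members in pathways.values():
        if candidate in members:
            focal.update(members)
    # The candidate itself is not one of its own focal genes.
    focal.discard(candidate)
    return focal


def snps_for_genes(genes, snp_to_gene):
    """Group SNP identifiers by gene, keeping only genes in `genes`."""
    by_gene = defaultdict(list)
    for snp, gene in snp_to_gene.items():
        if gene in genes:
            by_gene[gene].append(snp)
    return {gene: sorted(snps) for gene, snps in by_gene.items()}


# Toy inputs (hypothetical identifiers, not real annotation data).
ppi_edges = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_A"), ("GENE_D", "GENE_E")]
pathways = {
    "pathway_1": {"GENE_A", "GENE_F"},
    "pathway_2": {"GENE_D", "GENE_E"},
}
snp_to_gene = {
    "snp_1": "GENE_B",
    "snp_2": "GENE_F",
    "snp_3": "GENE_E",
    "snp_4": "GENE_B",
}

focal = focal_genes("GENE_A", ppi_edges, pathways)
focal_snps = snps_for_genes(focal, snp_to_gene)
```

With these toy inputs, the focal genes of GENE_A are its two interaction partners (GENE_B, GENE_C) and its pathway co-member (GENE_F). Only the SNPs mapped to those genes are carried forward, which mirrors the idea of narrowing a genome-wide SNP set to a focused, network-based subset.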
