Abstract

Background

Phenotypically similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes, as molecular machines that integrate multiple gene products to perform biological functions, express the underlying modular organization of protein-protein interaction networks. As such, protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.

Methodology/Principal Findings

We proposed a technique called RWPCN (Random Walker on Protein Complex Network) for predicting and prioritizing disease genes. The basis of RWPCN is a protein complex network constructed using existing human protein complexes and protein interaction network. To prioritize candidate disease genes for the query disease phenotypes, we compute the associations between the protein complexes and the query phenotypes in their respective protein complex and phenotype networks. We tested RWPCN on predicting gene-phenotype associations using leave-one-out cross-validation; our method was observed to outperform existing approaches. We also applied RWPCN to predict novel disease genes for two representative diseases, namely, Breast Cancer and Diabetes.

Conclusions/Significance

Guilt-by-association prediction and prioritization of disease genes can be enhanced by fully exploiting the underlying modular organizations of both the disease phenome and the protein interactome. Our RWPCN uses a novel protein complex network as a basis for interrogating the human phenome-interactome network. As the protein complex network can capture the underlying modularity in the biological interaction networks better than simple protein interaction networks, RWPCN was found to be able to detect and prioritize disease genes better than traditional approaches that used only protein-phenotype associations.
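The abstract does not state the update rule RWPCN uses, so the following is only a minimal sketch of a generic random walk with restart on a small network, not the authors' implementation. The node names (`C1` to `C4`), the restart probability of 0.7, and the function name are assumptions chosen for illustration. Nodes stand in for protein complexes, and the seed node stands in for a complex already associated with the query phenotype.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Generic random walk with restart (illustrative, not the paper's exact method).

    adjacency: dict mapping node -> dict of neighbour -> edge weight
    seeds: set of nodes the walker jumps back to when it restarts
    Returns a dict mapping node -> steady-state visiting probability.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total edge weight leaving each node, used to normalise transitions.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: keep its probability mass in place.
                nxt[u] += (1 - restart) * p[u]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: four complexes connected in a chain.
toy_network = {
    "C1": {"C2": 1.0},
    "C2": {"C1": 1.0, "C3": 1.0},
    "C3": {"C2": 1.0, "C4": 1.0},
    "C4": {"C3": 1.0},
}
scores = random_walk_with_restart(toy_network, seeds={"C1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy chain the score falls off with distance from the seed, which is the guilt-by-association intuition: nodes closer to known disease-associated nodes are prioritized first.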

Highlights

  • Uncovering the associations between the genetic diseases and their causative genes is a fundamental objective of human genetics [1]

  • A common approach is to measure the similarity between candidate genes and known disease-causing genes using biological evidence about these genes, such as protein sequence information [5], gene expression profiles [6], and even literature descriptions [7]

  • Whole-genome evaluation, proposed by [15], ranks all genes to scan for disease genes; for example, we can take all Human Protein Reference Database (HPRD) genes that are not linked to the query phenotype as candidates and count how many known test disease genes are still ranked first in the cross-validation test
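The whole-genome evaluation described in the last highlight can be sketched as a leave-one-out loop. This is an illustration under stated assumptions, not the procedure of [15]: the gene names, the toy interaction map, and the shared-neighbour scoring function are all hypothetical, and a held-out gene counts as a hit only when it scores strictly higher than every unlinked gene.

```python
def top1_hits(known_genes, all_genes, score_fn):
    """Count held-out disease genes that rank strictly first among candidates."""
    hits = 0
    for held_out in known_genes:
        # Hide one known gene-phenotype link; the rest act as training genes.
        training = set(known_genes) - {held_out}
        # Candidates: the held-out gene plus every gene not linked to the phenotype.
        candidates = [g for g in all_genes if g not in training]
        held_score = score_fn(held_out, training)
        other_scores = [score_fn(g, training) for g in candidates if g != held_out]
        if all(held_score > s for s in other_scores):
            hits += 1
    return hits


# Hypothetical interaction map: G1, G2, G3 form a triangle; G4 interacts with G5.
neighbours = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2"},
    "G4": {"G5"},
    "G5": {"G4"},
}


def shared_neighbour_score(gene, training):
    # Toy score: number of training disease genes the gene interacts with.
    return len(neighbours[gene] & training)


hits = top1_hits(
    known_genes=["G1", "G2", "G3", "G4"],
    all_genes=["G1", "G2", "G3", "G4", "G5"],
    score_fn=shared_neighbour_score,
)
```

Here three of the four known genes are recovered at rank one; `G4` is missed because it shares no interactions with the remaining known genes, so nothing distinguishes it from the unlinked gene `G5`.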

Summary

Introduction

Uncovering the associations between genetic diseases and their causative genes is a fundamental objective of human genetics [1]. The number of genes confirmed as causative for diseases has been increasing [4]. Such information can be exploited by computational methods to predict or prioritize new disease-gene associations. Candidate genes that are highly similar to known disease-causing genes can be ranked as putative disease genes to be validated by biologists or clinicians. These approaches are limited by the quality and completeness of the biological evidence. Similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.
