Abstract

The identification of disease-causing genes is a fundamental challenge in human health: it is of great importance for improving medical care and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and on disease similarities have shown their power in tackling the issue. This paper presents a novel systematic and global method that integrates two heterogeneous networks to prioritize candidate disease-causing genes, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of the two heterogeneous networks can be incorporated into the definition of the score function, and an iterative algorithm is designed to compute the scores. The method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene among the top ten in 622 of the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and novel candidate causal genes and candidate disease-causing subnetworks are suggested for further investigation.
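To make the "weighted sum over similar diseases and neighbouring genes" concrete, the following is a minimal sketch of one possible iterative scheme. The exact score function, the normalization, and the way the topological correlation enters are not given in this text, so the update rule, the row normalization, the `prior_weight` parameter, and the toy data are all assumptions made for illustration, not the authors' algorithm.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 so each update is a weighted average."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([x / total if total else 0.0 for x in row])
    return normalized


def prioritize(disease_sim, gene_weights, prior, prior_weight=0.5,
               tol=1e-9, max_iter=1000):
    """Iterate an ASSUMED update rule until the scores stop changing:

        score[d][g] = (1 - prior_weight) * sum over (d2, g2) of
                          D[d][d2] * G[g][g2] * score[d2][g2]
                      + prior_weight * prior[d][g]

    D and G are the row-normalized disease-similarity and gene-gene weights.
    """
    d_norm = row_normalize(disease_sim)
    g_norm = row_normalize(gene_weights)
    n_d, n_g = len(prior), len(prior[0])
    scores = [row[:] for row in prior]
    for _ in range(max_iter):
        updated = []
        for d in range(n_d):
            row = []
            for g in range(n_g):
                # Weighted sum over similar diseases and neighbouring genes.
                spread = sum(
                    d_norm[d][d2] * g_norm[g][g2] * scores[d2][g2]
                    for d2 in range(n_d) for g2 in range(n_g)
                )
                row.append((1 - prior_weight) * spread
                           + prior_weight * prior[d][g])
            updated.append(row)
        delta = max(abs(updated[d][g] - scores[d][g])
                    for d in range(n_d) for g in range(n_g))
        scores = updated
        if delta < tol:
            break
    return scores


def rank_genes(score_row):
    """Return gene indices ordered from highest to lowest score."""
    return sorted(range(len(score_row)), key=lambda g: score_row[g],
                  reverse=True)


# Hypothetical toy data: two similar diseases, four genes on a path 0-1-2-3.
# The diagonal entries (self-weights) are an assumption of this sketch.
disease_sim = [[1.0, 0.8],
               [0.8, 1.0]]
gene_weights = [[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]]
# Disease 0 has one known causal gene (gene 0); disease 1 has none.
prior = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]

scores = prioritize(disease_sim, gene_weights, prior)
print(rank_genes(scores[1]))  # candidate ranking for the query disease 1
```

In this toy setting the query disease has no known gene of its own, so its ranking is driven entirely by the known gene of the similar disease and by distance in the gene network. Because both weight matrices are row-normalized and `prior_weight` is positive, each iteration shrinks the change in the scores, which is why the loop terminates.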

Highlights

  • Computational investigation of gene functions in the context of complex biological systems has been greatly promoted by the accumulation of high-throughput data. Among these data, protein-protein interactions have been exploited to identify disease-causing genes, based on the observation that genes implicated in the same or similar diseases tend to be located in a specific neighbourhood of the protein-protein interaction network [1,2,3]

  • As stated for CIPHER [6], the disease-gene associations are taken from the Online Mendelian Inheritance in Man (OMIM) knowledge database [12], and the protein-protein interactions from the Human Protein Reference Database (HPRD) [14]

  • There are three parameters to be tuned: (1) the threshold parameter b, which filters out every disease similarity and prior association score smaller than it and is set to 0.5; (2) a, which controls the relative importance of the prior information in the computation of the disease-gene association scores
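The threshold step described for parameter b can be sketched as follows. This is an illustrative reading only: the function name `apply_threshold` and the choice to replace filtered entries with zero are assumptions, since the text says only that values smaller than b are filtered out.

```python
def apply_threshold(matrix, b=0.5):
    """Keep entries that are at least b; set smaller entries to 0.0.

    Zeroing the filtered entries is an assumption of this sketch.
    """
    return [[x if x >= b else 0.0 for x in row] for row in matrix]


# Hypothetical disease-similarity values.
similarity = [[1.0, 0.49, 0.7],
              [0.49, 1.0, 0.2],
              [0.7, 0.2, 1.0]]
filtered = apply_threshold(similarity, b=0.5)
print(filtered)
```

A value exactly equal to b is kept here, because the text filters only values "smaller than" the threshold.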


Summary

Introduction

Computational investigation of gene functions in the context of complex biological systems has been greatly promoted by the accumulation of high-throughput data. Among these data, protein-protein interactions have been exploited to identify disease-causing genes, based on the observation that genes implicated in the same or similar diseases tend to be located in a specific neighbourhood of the protein-protein interaction network [1,2,3]. Wu et al. [6] proposed a computational framework that integrates human protein-protein interactions, phenotype similarities, and known gene-phenotype associations to capture the complex relationships between disease phenotypes and genotypes. They defined the disease-gene association score as the global concordance score between the phenotype similarity profile and the gene closeness profile, and developed a tool named CIPHER to predict and prioritize candidate disease-causing genes. In their follow-up work [7], they used the network alignment technique to study, systematically and quantitatively, the consistency between disease phenotypic overlap and genetic overlap. In PRINCE, the prioritization for a given disease is done iteratively over the entire protein interaction network: each protein propagates the information received in the previous iteration to its neighbours.
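The propagation idea described for PRINCE can be illustrated with a short sketch on a single network and a single disease. The text above states only that each protein passes on the information it received in the previous iteration; the symmetric degree normalization, the mixing weight `alpha`, the stopping rule, and the toy network below are assumptions made for illustration.

```python
import math


def propagate(adjacency, prior, alpha=0.5, tol=1e-9, max_iter=1000):
    """Spread prior scores over a protein network until they stabilize.

    ASSUMED update: new[i] = alpha * sum_j w[i][j] * old[j]
                             + (1 - alpha) * prior[i]
    with w[i][j] = adjacency[i][j] / sqrt(degree[i] * degree[j]).
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    weights = [
        [adjacency[i][j] / math.sqrt(degree[i] * degree[j])
         if adjacency[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = list(prior)
    for _ in range(max_iter):
        # Each protein collects what its neighbours held in the last round.
        updated = [
            alpha * sum(weights[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * prior[i]
            for i in range(n)
        ]
        delta = max(abs(u - s) for u, s in zip(updated, scores))
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical toy network: four proteins on a path 0-1-2-3,
# with protein 0 carrying all of the prior evidence.
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
prior = [1.0, 0.0, 0.0, 0.0]
final_scores = propagate(adjacency, prior)
print(final_scores)
```

In this toy network the score decays with distance from the protein that carries the prior evidence, which is the behaviour a propagation-based prioritization relies on.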
