Abstract

Protein-protein interactions integrated with disease-gene associations represent important information for revealing protein functions under disease conditions to improve the prevention, diagnosis, and treatment of complex diseases. Although several studies have attempted to identify disease-gene associations, the number of possible disease-gene associations is very small. High-throughput experimental technologies have been established to identify the association between genes and diseases. However, these techniques are still expensive, time-consuming, and difficult to perform. Thus, based on currently available data and knowledge, computational methods have served as alternatives to provide more possible associations and increase our understanding of disease mechanisms. Here, a new network-based algorithm, namely, Disease-Gene Association (DGA), was developed to calculate the association score of a query gene to a new possible set of diseases. First, a large-scale protein interaction network was constructed, and the relationship between two interacting proteins was calculated with regard to the disease relationship. Novel plausible disease-gene pairs were then identified and statistically scored by our algorithm using neighboring protein information. The results yielded high performance for disease-gene prediction, with an F-measure of 0.78 and an AUC of 0.86. To identify promising candidate disease-gene associations, the association coverage of genes and diseases was calculated and used with the association score to perform gene and disease selection. Based on gene selection, we identified promising pairs that exhibited evidence related to several important diseases, e.g., inflammation, lipid metabolism, inborn errors, xanthomatosis, cerebellar ataxia, cognitive deterioration, malignant neoplasms of the skin and malignant tumors of the cervix. Focusing on disease selection, we identified target genes that were important to blistering skin diseases and muscular dystrophy. In summary, our developed algorithm is simple, efficiently identifies disease-gene associations in the protein-protein interaction network and provides additional knowledge regarding disease-gene associations. This method can be generalized to other association studies to further advance biomedical science.

Highlights

  • In cellular systems, proteins cooperate in various ways to accomplish needed functions

  • To identify more disease-gene associations, we developed an algorithm called “Disease-Gene Association (DGA)” based on k-nearest neighbors and local network analysis of large-scale human protein-protein interaction data integrated with disease relationships
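
To make the neighbor-based idea concrete, here is a minimal Python sketch of scoring candidate diseases for a query gene from its k strongest interaction partners. It is not the published DGA scoring formula, which this summary does not give. The function name `score_candidate_diseases`, the edge weights, the weighted-vote rule, and the toy gene and disease labels are all assumptions made for illustration.

```python
from collections import defaultdict


def score_candidate_diseases(gene, interactions, known_diseases, k=3):
    """Score diseases not yet linked to `gene` by a weighted vote of its neighbors.

    Hypothetical illustration only: the weighted-vote rule is an assumption,
    not the scoring function used by DGA.

    interactions   -- dict mapping (gene_a, gene_b) pairs to an edge weight
    known_diseases -- dict mapping each gene to the set of diseases linked to it
    k              -- number of strongest neighbors allowed to vote
    """
    # Collect every interaction partner of the query gene (edges are undirected).
    neighbors = []
    for (a, b), weight in interactions.items():
        if a == gene:
            neighbors.append((b, weight))
        elif b == gene:
            neighbors.append((a, weight))

    # Keep the k partners with the highest weights; break ties by name.
    neighbors.sort(key=lambda item: (-item[1], item[0]))
    top = neighbors[:k]

    total_weight = sum(weight for _, weight in top)
    if total_weight == 0:
        return {}

    # Each neighbor votes for its diseases in proportion to its edge weight.
    already_linked = known_diseases.get(gene, set())
    scores = defaultdict(float)
    for neighbor, weight in top:
        for disease in known_diseases.get(neighbor, set()):
            if disease not in already_linked:
                scores[disease] += weight / total_weight
    return dict(scores)


# Toy data (entirely made up) to show the call shape.
toy_interactions = {
    ("G1", "G2"): 0.9,
    ("G1", "G3"): 0.6,
    ("G1", "G4"): 0.5,
    ("G2", "G3"): 0.8,
}
toy_known = {
    "G1": {"D1"},
    "G2": {"D1", "D2"},
    "G3": {"D2"},
    "G4": {"D3"},
}

if __name__ == "__main__":
    ranked = score_candidate_diseases("G1", toy_interactions, toy_known, k=3)
    for disease, score in sorted(ranked.items(), key=lambda kv: -kv[1]):
        print(disease, round(score, 3))
```

Under this assumed rule, a disease scores higher when more of the query gene's strongly connected partners are already linked to it. Diseases already linked to the query gene are skipped, so only new candidate pairs are returned.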

Summary

Introduction

Proteins cooperate in various ways to accomplish needed functions. Several experimental methods have been established to identify disease-gene associations, such as genome-wide association studies (GWAS) [3], RNA interference (RNAi) screens [4], and linkage studies [5]. Since these methods are expensive and time-consuming, many databases of disease genes have been developed, and computational methods have become an important tool to retrieve and analyze the disease data for a better understanding of disease mechanisms. Among the most commonly used databases of disease genes, Online Mendelian Inheritance in Man (OMIM) [6] and GeneCards [7] collect manually curated data on the relationships between diseases and genes. Such relationships are inferred using data from gene variants [8, 9], biological pathways [10], gene expression data [11], biomedical ontologies [12] or text mining [13]. Many useful gene or protein networks have become widely used, such as the STRING database [21] for curated protein-protein interaction networks.

