Abstract

The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and clinical manipulation in disease prevention and treatment. Yet the existing approaches in tackling this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease-disease association network was constructed by the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease-gene associations to connect the disease-disease network and gene-gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on the Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods into the most updated disease-gene association database termed DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and the precision-recall curve, as well as the highest precision and recall, in both the global 5-fold cross-validation and predicting new disease-gene associations. We further performed DGHNE in identifying the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causing genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene-disease associations were highly evidenced by independent experimental studies.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call