Abstract

Objective

In the context of “network medicine”, gene prioritization methods are one of the main tools for discovering candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works have proposed integrating multiple data sources to improve disease gene prioritization, but to our knowledge no systematic study has quantitatively evaluated the impact of network integration on gene prioritization. In this paper, we aim to provide an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization.

Materials and methods

We collected nine networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the 708 medical subject headings (MeSH) diseases considered, by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions.

Results

Classical random walk algorithms applied to the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions combined with network integration raised the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at the 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further biomedical investigation.

Conclusions

Network integration is necessary to boost the performance of gene prioritization methods. Moreover, methods based on kernelized score functions can further improve disease gene ranking by adopting both local and global learning strategies that exploit the overall topology of the network.
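The two steps named in the abstract, integrating several networks and then ranking genes by their proximity to known disease genes, can be illustrated with a minimal sketch. It assumes the simplest unweighted integration (an element-wise mean of adjacency matrices) and a textbook random walk with restart; the toy networks, the restart probability of 0.3 and all function names are hypothetical choices made for illustration, not the authors' implementation.

```python
def integrate_unweighted(networks):
    """Unweighted integration: element-wise mean of the adjacency matrices."""
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Score each gene by its steady-state visiting probability from the seeds.

    `restart` is an assumed value; in practice it is a tunable parameter.
    """
    n = len(adj)
    # Column sums turn edge weights into transition probabilities;
    # isolated nodes get a divisor of 1.0 to avoid division by zero.
    col_sums = [sum(adj[i][j] for i in range(n)) or 1.0 for j in range(n)]
    # The walk starts (and restarts) uniformly on the known disease genes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        p_next = [
            (1 - restart) * sum(adj[i][j] / col_sums[j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Two toy networks over four genes (0..3); each alone leaves some gene isolated.
net_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
net_b = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

combined = integrate_unweighted([net_a, net_b])
scores = random_walk_with_restart(combined, seeds={0})  # gene 0 is the known disease gene
ranking = sorted(range(len(scores)), key=lambda g: -scores[g])
print(ranking)
```

In this toy example gene 3 is unreachable from the seed in the first network alone and becomes rankable only after integration, which is the kind of effect the study quantifies. A weighted integration would replace the plain mean with a per-network weight reflecting how informative each network is.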

Highlights

  • We aim to analyze the impact of network integration on gene prioritization, in order to understand whether combining multiple networks constructed from different sources of information can significantly enhance the performance of gene prioritization methods, and to provide a quantitative assessment of this hypothesized improvement. To this end we deliberately considered relatively simple methods, ranging from unweighted to weighted network integration algorithms, and excluded more complex algorithms proposed in the literature. This allows us to perform an extensive analysis involving a large set of diseases, a large set of human genes and a significant subset of the integration methods applied to gene prioritization problems.

Summary

Introduction

The growing awareness that a disease is rarely the consequence of an abnormality in a single gene, but is usually the result of complex interactions and perturbations involving large sets of genes and their relationships with several cellular components, led to the development of “network medicine”, a network-based approach to human disease [1]. In this context, gene prioritization methods have progressed quickly, with the aim of discovering candidate “disease” genes by exploiting the large amount of available “omics” data covering different types of relationships between genes [2].
