Abstract

A network-based approach has proven useful for the identification of novel genes associated with complex phenotypes, including human diseases. Because network-based gene prioritization algorithms propagate information about known phenotype-associated genes through networks, the pathway structure of each phenotype might significantly affect how effective an algorithm is. We systematically compared two popular network algorithms with distinct mechanisms – direct neighborhood, which propagates information only to direct network neighbors, and network diffusion, which diffuses information throughout the entire network – in the prioritization of genes for worm and human phenotypes. Previous studies reported that network diffusion generally outperforms direct neighborhood for human diseases. Although prioritization power is generally measured over all ranked genes, only the top candidates are relevant to subsequent functional analysis. We found that high prioritizing power of a network algorithm for all genes does not guarantee successful prioritization of top-ranked candidates for a given phenotype. Indeed, for the majority of the phenotypes that were more efficiently prioritized by network diffusion over all ranks, direct neighborhood showed higher prioritizing power for top candidates. We also found that connectivity among pathway genes for each phenotype largely determines which network algorithm is more effective, suggesting that the network algorithm used for each phenotype should be chosen with consideration of pathway gene connectivity.
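The contrast between the two propagation mechanisms can be illustrated with a minimal sketch on a toy network. Everything here is an assumption made for illustration: the gene names, edge weights, and seed set are hypothetical, and random walk with restart is used as one common form of network diffusion, since the text above does not specify which diffusion variant or parameters were used.

```python
# Toy weighted, undirected gene network; gene names and weights are hypothetical.
EDGES = {
    ("A", "B"): 1.0,
    ("B", "C"): 1.0,
    ("C", "D"): 1.0,
    ("A", "E"): 0.5,
}

def build_adjacency(edges):
    """Return {gene: {neighbor: weight}} for an undirected edge list."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def direct_neighborhood(adj, seeds):
    """Score each gene by the summed edge weight to its direct seed neighbors."""
    return {gene: sum(w for n, w in nbrs.items() if n in seeds)
            for gene, nbrs in adj.items()}

def network_diffusion(adj, seeds, restart=0.5, iterations=100):
    """Random walk with restart (an assumed diffusion variant).

    The restart probability and iteration count are illustrative defaults.
    """
    p0 = {g: (1.0 / len(seeds) if g in seeds else 0.0) for g in adj}
    degree = {g: sum(nbrs.values()) for g, nbrs in adj.items()}
    p = dict(p0)
    for _ in range(iterations):
        nxt = {}
        for g in adj:
            # Mass flowing into g from each neighbor n, split in proportion
            # to the weights on n's edges.
            inflow = sum(p[n] * w / degree[n] for n, w in adj[g].items())
            nxt[g] = (1 - restart) * inflow + restart * p0[g]
        p = nxt
    return p

adj = build_adjacency(EDGES)
seeds = {"A"}  # hypothetical known phenotype gene
dn = direct_neighborhood(adj, seeds)
nd = network_diffusion(adj, seeds)
```

In this toy case, gene `D` is three steps away from the seed, so direct neighborhood assigns it no score while diffusion still gives it a small positive one. That is the mechanistic difference the comparison rests on.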

Highlights

  • Genes that are associated with the same phenotypes tend to be co-functional

  • Our analysis showed that high prioritizing power for all genes does not guarantee successful prioritization of top candidate genes, and that the effectiveness of the two network algorithms, both for entire ranks and for early retrieval, is largely affected by pathway gene connectivity in the network

  • Effectiveness of network algorithms differs among the three classes of phenotypes. In the analysis described above, phenotype genes were prioritized in a leave-one-out analysis setting, in which the score of each phenotype gene is determined by the sum of the edge weights to all other phenotype genes
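The leave-one-out scoring described in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the network, gene names, and phenotype gene set are hypothetical, and the two summaries (a pairwise AUC over all ranks and a top-k hit count as a simple stand-in for early retrieval) are illustrative choices, since the exact metrics are not given in the text above.

```python
# Hypothetical weighted gene network as {gene: {neighbor: edge weight}}.
ADJ = {
    "g1": {"g2": 2.0, "g3": 1.0},
    "g2": {"g1": 2.0, "g4": 0.5},
    "g3": {"g1": 1.0, "g5": 1.0},
    "g4": {"g2": 0.5},
    "g5": {"g3": 1.0},
}
PHENOTYPE_GENES = {"g1", "g2", "g3"}  # hypothetical known phenotype genes

def leave_one_out_scores(adj, phenotype_genes):
    """Score every gene by summed edge weight to the *other* phenotype genes."""
    scores = {}
    for gene, nbrs in adj.items():
        # A phenotype gene is held out of the seed set when it is scored.
        others = phenotype_genes - {gene}
        scores[gene] = sum(w for n, w in nbrs.items() if n in others)
    return scores

def auc_all_ranks(scores, positives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def top_k_hits(scores, positives, k):
    """Count phenotype genes among the k highest-scoring genes."""
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return sum(1 for g in ranked[:k] if g in positives)

scores = leave_one_out_scores(ADJ, PHENOTYPE_GENES)
auc = auc_all_ranks(scores, PHENOTYPE_GENES)
hits = top_k_hits(scores, PHENOTYPE_GENES, k=2)
```

Reporting an all-rank summary next to a top-k summary makes it possible to see cases where one algorithm wins on the full ranking but loses among the first few candidates.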



Introduction

Genes that are associated with the same phenotypes tend to be co-functional. This functional association between genes can be harnessed to identify novel genes that might be associated with complex phenotypes, for example human diseases [1,2,3]. Network-based gene prioritization for phenotypes involves four factors: i) gene networks, ii) known genes for a phenotype of interest, iii) algorithms to propagate information of known phenotype genes through the network, and iv) metrics to assess prioritization models. Over the past several years, many genome-scale gene networks for various organisms, including humans, have become publicly available and have been used for the prediction of novel disease genes [4,5,6,7]. The number of phenotype annotations for genes has

PLOS ONE | DOI:10.1371/journal.pone.0130589 | June 19, 2015
