Abstract

One of the fundamental goals of genetics is to understand gene functions and their associated phenotypes. To this end, we developed a computational algorithm that uses orthology and protein-protein interaction information to infer gene-phenotype associations for multiple species. We also developed a web server that provides genome-wide phenotype inference for six species: fly, human, mouse, worm, yeast, and zebrafish. We evaluated the inference method by comparing the inferred results with known gene-phenotype associations; the high Area Under the Curve values indicate that the method performs well. By applying the method to two representative human diseases, Type 2 Diabetes and Breast Cancer, we demonstrated that it can identify related Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways. The web server can be used to infer the functions and putative phenotypes of a gene, as well as the candidate genes of a phenotype, and thus aids disease candidate gene discovery. Our web server is available at http://jjwanglab.org/PhenoPPIOrth.

Highlights

  • Phenotypes denote the observable physical or biological traits of an organism

  • Determination of the parameter l: According to our scoring function, for a given gene, l is the only parameter that can be preset to affect the ranking of candidate phenotypes, because the PPI and orthology data are already fixed in the database

  • If a phenotype below the cutoff agrees with a known phenotype, it is regarded as a false negative; otherwise, it is regarded as a true negative
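
The cutoff rule in the last highlight can be illustrated with a minimal sketch. Only the false-negative and true-negative cases are stated in the text; the sketch assumes the symmetric rule for candidates above the cutoff (true positive if known, false positive otherwise) and assumes the cutoff is a rank position in the candidate list. The function name `confusion_counts` and the phenotype labels are hypothetical and are not taken from the web server.

```python
from typing import Dict, List, Set


def confusion_counts(ranked: List[str], known: Set[str], cutoff: int) -> Dict[str, int]:
    """Count TP/FP/FN/TN for one gene's ranked candidate phenotypes.

    `ranked` is ordered best-first; the first `cutoff` entries are treated as
    above the cutoff (predicted), the remaining entries as below it.
    """
    above, below = ranked[:cutoff], ranked[cutoff:]
    tp = sum(1 for p in above if p in known)  # assumed: above cutoff and known
    fp = len(above) - tp                      # assumed: above cutoff, not known
    fn = sum(1 for p in below if p in known)  # from the text: below cutoff and known
    tn = len(below) - fn                      # from the text: below cutoff, not known
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical example: five candidate phenotypes, two of which are known.
ranked = ["P1", "P2", "P3", "P4", "P5"]
known = {"P1", "P4"}
counts = confusion_counts(ranked, known, cutoff=2)
print(counts)
```

Repeating this count for every possible cutoff yields the true-positive and false-positive rates that trace a ROC curve, which is the kind of curve an Area Under the Curve value summarizes.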

Summary

Introduction

Phenotypes denote the observable physical or biological traits of an organism. Understanding the relations between genes and their functions (or related phenotypes) is one of the main objectives of genetics in the post-genome era [1,2,3]. The number of genes with identified phenotypes has not yet reached the genomic scale, owing to technical challenges such as the multi-functionality of genes and the heterogeneity of diseases [4,5,6]. To date, various types of proteomic and genomic data, such as protein-protein interaction (PPI) data [6,7,8,9,10,11,12], sequence data [13,14], and function annotations [15,16,17,18,19], have been used to identify gene-phenotype associations. Researchers have employed machine learning approaches and function annotations to construct models [14,16,24]
