Abstract

Background: Gene Ontology (GO) is a collaborative project that maintains and develops a controlled vocabulary (a set of terms) to describe the molecular functions, biological roles and cellular locations of gene products in a hierarchical ontology. GO also provides annotations that associate genes with GO terms. Members of the GO consortium independently and collaboratively annotate gene products with terms, mainly for the model organisms (species) they are interested in. Owing to experimental ethics, the research interests of biologists and resource limitations, homologous genes from different species are currently annotated with different terms. These differences are attributable more to incomplete annotation of the genes than to functional differences between them.

Results: Semantic similarity between genes is derived from the GO hierarchy and the annotations of genes. It is positively correlated with similarity derived from various types of biological data and has been applied to predict gene function. In this paper, we investigate whether annotations of incompletely annotated genes can be replenished by using semantic similarity between genes from two homologous species. For this investigation, we use three representative semantic similarity metrics to compute the similarity between genes from two species. Next, we determine the k nearest neighbor genes from the two species based on the chosen metric and then use the terms annotated to the k neighbors of a gene to replenish the annotations of that gene. We perform experiments on archived (Jan-2014 to Jan-2016) GO annotations of four species (Human, Mouse, Danio rerio and Arabidopsis thaliana) to assess the contribution of semantic similarity between genes from different species. The experimental results demonstrate that: (1) semantic similarity between genes from homologous species contributes much more to the improved accuracy (by 53.22%) than genes from a single species alone or genes from two species with low homology; (2) GO annotations of genes from homologous species are complementary to each other.

Conclusions: Our study shows that semantic-similarity-based interspecies gene function annotation from homologous species is more effective than traditional intraspecies approaches. This work can promote more research on semantic-similarity-based function prediction across species.

Electronic supplementary material: The online version of this article (doi:10.1186/s12918-016-0361-5) contains supplementary material, which is available to authorized users.
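The k-nearest-neighbor replenishment step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Jaccard overlap of annotation sets stands in for the three semantic similarity metrics (which are not named here), and the gene names, the placeholder term IDs, and the helper names `jaccard` and `replenish` are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Stand-in similarity (an assumption, not one of the paper's metrics):
    overlap of two sets of GO terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def replenish(target, annotations, k=2, similarity=jaccard):
    """Score candidate terms for `target` using its k most similar genes."""
    own = annotations[target]
    # Rank every other gene, from either species, by similarity to the target.
    ranked = sorted(
        (
            (similarity(own, terms), gene)
            for gene, terms in annotations.items()
            if gene != target
        ),
        reverse=True,
    )
    scores = defaultdict(float)
    for sim, gene in ranked[:k]:
        # Only terms the target does not already have are candidates.
        for term in annotations[gene] - own:
            scores[term] += sim  # similarity-weighted vote
    return dict(scores)


# Toy annotations pooled from two species; names and term IDs are placeholders.
annotations = {
    "human:G1": {"GO:A", "GO:B"},
    "human:G2": {"GO:A", "GO:B", "GO:C"},
    "mouse:G1": {"GO:A", "GO:B", "GO:D"},
    "mouse:G2": {"GO:E"},
}
candidates = replenish("human:G1", annotations, k=2)
```

In this toy setup, the two nearest neighbors of `human:G1` come from both species, so the candidate terms include one contributed by the mouse gene. Pooling genes from two species into a single `annotations` map is the simplest way to let neighbors cross species boundaries; a real pipeline would also need a threshold or ranking cutoff on the scores.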

Highlights

  • Gene Ontology (GO) is a collaborative project that maintains and develops a controlled vocabulary to describe the molecular functions, biological roles and cellular locations of gene products in a hierarchical ontology

  • Datasets and experimental setup: To comparatively study the contribution of integrating semantic similarity between genes and GO annotations of genes from two species, we conduct experiments on annotations of genes from Human and Mouse

  • We downloaded a recent GO file [47] that contains the hierarchical relationships between GO terms. These terms are organized in three sub-ontologies, namely biological process (BP), cellular component (CC) and molecular function (MF); the terms in each sub-ontology form a directed acyclic graph (DAG)

Summary

Introduction

Gene Ontology (GO) is a collaborative project that maintains and develops a controlled vocabulary (a set of terms) to describe the molecular functions, biological roles and cellular locations of gene products in a hierarchical ontology. Owing to experimental ethics, the research interests of biologists and resource limitations, homologous genes from different species are currently annotated with different terms. These differences are attributable more to incomplete annotation of the genes than to functional differences between them. Gene products, both proteins and RNAs, play crucial roles in many, if not all, life processes, such as metabolism, signal transduction and hormonal regulation. Annotating their biological functions is a crucial link in the development of drugs, vaccines, bio-… In GO, a gene annotated with a term is also annotated with the ancestors of that term; this rule is known as the true path rule [19, 20].
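To make the true path rule concrete, here is a small sketch that expands a gene's direct annotations with all ancestor terms in a DAG. It assumes the usual reading of the rule (a gene annotated with a term is implicitly annotated with that term's ancestors). The term names `t1` to `t4`, the `parents` map, and the function names are hypothetical and do not come from any GO file.

```python
def ancestors(term, parents):
    """Return all ancestors of `term`, given a child -> set-of-parents map."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def apply_true_path_rule(direct_terms, parents):
    """Expand direct annotations with every ancestor term."""
    expanded = set(direct_terms)
    for term in direct_terms:
        expanded |= ancestors(term, parents)
    return expanded


# Toy DAG: t4 has two parents (t2 and t3), which share the root t1.
parents = {"t4": {"t2", "t3"}, "t2": {"t1"}, "t3": {"t1"}}
expanded = apply_true_path_rule({"t4"}, parents)
```

Because a term in a DAG can have several parents, the traversal tracks visited terms so that a shared ancestor such as `t1` is collected only once.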
