Abstract

Background

The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years, providing tools that allow the biological knowledge embedded in the GO structure to be integrated into different biological analyses. There is a need for a unified tool that gives the scientific community the opportunity to explore these different GO similarity measure approaches and their biological applications.

Results

We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations, as provided by the Gene Ontology Annotation (GOA) project, to precompute GO term information content (IC), enabling rapid response to user queries.

Conclusions

The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches, to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.

Highlights

  • The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses

  • During the last decade, several Gene Ontology (GO) semantic similarity approaches [1,2,3,4,5,6,7,8,9,10] have been introduced for assessing the specificity of GO terms and the relationships between them, based on their position in the GO Directed Acyclic Graph (DAG) [11,12,13]

  • We provide and briefly discuss some illustrations of the biological applications included in the DaGO-Fun tool, namely the GO Term Similarity based Protein-Fuzzy Identification Tool (GOSP-FIT), the GO based Similarity Protein-Fuzzy Classification Tool (GOSP-FCT) and the GO Semantic Similarity based-Fuzzy Enrichment Analysis Tool (GOSS-FEAT)

Summary

Introduction

The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Information content based approaches, which rely on a numerical value to convey the description and specificity of a GO term using its position in the structure, were introduced [1]. This numerical value is called information content (IC) or semantic value, and depending on how the IC of a term is conceived, these approaches are divided into two main families: annotation-based and topology-based. Those depending only on the intrinsic topology of the GO structure are referred to as topology-based approaches, while those using the frequencies at which terms occur in the corpus under consideration are referred to as annotation-based approaches.
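
To make the annotation-based family concrete, the sketch below computes an IC value for each term as the negative logarithm of its relative annotation frequency, a widely used definition in the semantic similarity literature. It is a minimal illustration and not the DaGO-Fun implementation: the toy DAG, the term and protein names, and the function names are all hypothetical, and annotations are assumed to propagate from a term to all of its ancestors.

```python
import math

# Hypothetical toy DAG (not real GO identifiers): term -> list of parent terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],
}

# Hypothetical direct annotations: protein -> set of terms it is annotated with.
ANNOTATIONS = {
    "P1": {"C"},
    "P2": {"A"},
    "P3": {"B"},
    "P4": {"root"},
}


def ancestors(term, parents):
    """Return the term itself together with all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def annotation_ic(parents, annotations):
    """Annotation-based IC: log(N / n_t), i.e. -log of the term's frequency.

    n_t counts proteins annotated to the term or to any of its descendants
    (assumed propagation rule); N is the total number of annotated proteins.
    Terms with no annotated protein are omitted because their IC is undefined.
    """
    proteins_per_term = {term: set() for term in parents}
    for protein, terms in annotations.items():
        for term in terms:
            for anc in ancestors(term, parents):
                proteins_per_term[anc].add(protein)
    total = len(annotations)
    return {
        term: math.log(total / len(proteins))
        for term, proteins in proteins_per_term.items()
        if proteins
    }


ic = annotation_ic(PARENTS, ANNOTATIONS)
for term in sorted(ic, key=ic.get):
    print(term, round(ic[term], 3))
```

In this toy corpus the root covers every protein and so carries no information, while the most specific term, C, receives the highest value. A topology-based measure would replace the protein counts with a quantity derived only from the structure of the DAG, such as the number of descendants of a term.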
