Abstract

The Gene Ontology (GO) is an important component of modern biological knowledge representation and is widely used in the computational analysis of genomic and genetic data. The Gene Ontology Consortium (GOC) is a large team of contributors that includes curation teams from most model organism database groups, as well as curation teams focused on representing data relevant to specific human diseases. Shared standards and measures of curation quality are key to generating consistent and comprehensive annotations. The GOC engages all contributors to work to a defined standard of curation, which is presented here in the context of annotating genes in the laboratory mouse. A comprehensive understanding of the origin, epistemology, and coverage of GO annotations is essential for the most effective use of GO resources. The application of comparative approaches to capturing functional data in the mouse system is also described.

Highlights

  • The Mouse Genome Database (MGD), as a representative member of the Gene Ontology Consortium (GOC), uses a variety of annotation strategies to provide the best possible annotation set for mouse genes and to contribute to the annotation of the other reference genomes

Introduction

The Gene Ontology (GO; The Gene Ontology Consortium 2000, 2015) provides a structured, controlled vocabulary used by a wide range of biological knowledge bases to create annotations that describe a gene product’s function, the overall biological objective of the function, and the cellular location where the function occurs.

  • Arabidopsis thaliana (The Arabidopsis Information Resource (TAIR))
  • Caenorhabditis elegans (WormBase)
  • Danio rerio (zebrafish; Zebrafish Model Organism Database (ZFIN))
  • Dictyostelium discoideum (dictyBase)
  • Drosophila melanogaster (FlyBase)
  • Escherichia coli (PortEco)
  • Gallus gallus (AgBase)
  • Homo sapiens (human UniProtKB-Gene Ontology Annotation [UniProtKB-GOA] @ EBI)
  • Mus musculus (Mouse Genome Informatics)
  • Rattus norvegicus (Rat Genome Database (RGD))
  • Saccharomyces cerevisiae (Saccharomyces Genome Database (SGD))
  • Schizosaccharomyces pombe (PomBase)

… annotate genes from uncharacterized species based on the experimental work that has been done, wherever it may fall within the phylogenetic tree, often allowing use of more specific GO terms than are generated using some of the other annotation transfer pipelines.

  • A tool for categorizing a gene set according to a set of high-level GO terms, a ‘GO slim’
  • A GO Term Finder type tool with a graphical output
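
As an illustration of the ‘GO slim’ idea mentioned above, the following is a minimal Python sketch of how a gene set might be grouped under a small set of high-level terms by walking each annotated term up to its ancestors. It is not the implementation of any GOC tool. The hierarchy, term names, gene names, and the functions `ancestors` and `slim_groups` are all invented for this example and do not correspond to real GO identifiers or real annotations.

```python
from collections import defaultdict

# Invented toy hierarchy (child -> parents); NOT the real GO graph.
TOY_PARENTS = {
    "term_a1": ["slim_A"],
    "term_a2": ["term_a1"],
    "term_b1": ["slim_B"],
    "term_ab": ["term_a1", "term_b1"],  # a term may have more than one parent
    "slim_A": ["root"],
    "slim_B": ["root"],
}

# Hypothetical gene set with invented annotations (gene -> annotated terms).
TOY_ANNOTATIONS = {
    "gene1": ["term_a2"],
    "gene2": ["term_b1"],
    "gene3": ["term_ab"],
}


def ancestors(term, parents):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def slim_groups(annotations, slim_terms, parents):
    """Group genes under each slim term that equals, or is an ancestor of,
    one of the terms the gene is annotated to."""
    buckets = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for hit in ancestors(term, parents) & slim_terms:
                buckets[hit].add(gene)
    return {slim: sorted(genes) for slim, genes in buckets.items()}


result = slim_groups(TOY_ANNOTATIONS, {"slim_A", "slim_B"}, TOY_PARENTS)
for slim_term in sorted(result):
    print(slim_term, result[slim_term])
```

In this toy data, a gene annotated to a term with two parents is counted under both high-level categories, which is why slim-based summaries can add up to more than the number of genes in the set.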