Abstract

The Protein Ontology (PRO; http://proconsortium.org) formally defines protein entities and explicitly represents their major forms and interrelations. Protein entities represented in PRO that correspond to single amino acid chains are categorized by level of specificity into family, gene, sequence and modification metaclasses, and there is a separate metaclass for protein complexes. All metaclasses also have organism-specific derivatives. PRO complements established sequence databases such as UniProtKB, and interoperates with other biomedical and biological ontologies such as the Gene Ontology (GO). PRO relates to UniProtKB in that PRO’s organism-specific classes of proteins encoded by a specific gene correspond to entities documented in UniProtKB entries. PRO relates to the GO in that PRO’s representations of organism-specific protein complexes are subclasses of the organism-agnostic protein complex terms in the GO Cellular Component Ontology. The past few years have seen growth of and changes to PRO, as well as new points of access to the data and new applications of PRO in immunology and proteomics. Here we describe some of these developments.
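The levels of specificity described above can be pictured as a chain of is_a links running from the most specific term up to the most general. The following is a minimal sketch of that idea only, not PRO's actual data model or API: the `Level`, `Term`, `ancestors` and `is_a` names are hypothetical, the `EX:` identifiers and labels are invented placeholders rather than real PRO terms, and the protein complex metaclass is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Level(Enum):
    """Specificity levels for single-chain protein terms, most general first."""
    FAMILY = 1
    GENE = 2
    SEQUENCE = 3
    MODIFICATION = 4


@dataclass(frozen=True)
class Term:
    term_id: str                      # hypothetical identifier, not a real PRO ID
    label: str
    level: Level
    parent: Optional["Term"] = None   # is_a link to a more general term
    organism: Optional[str] = None    # set only for organism-specific derivatives


def ancestors(term: Term) -> List[Term]:
    """Return the is_a chain from the direct parent up to the root."""
    chain = []
    current = term.parent
    while current is not None:
        chain.append(current)
        current = current.parent
    return chain


def is_a(term: Term, other: Term) -> bool:
    """True if `term` is `other` itself or one of its descendants."""
    return term == other or other in ancestors(term)


# Invented example terms; every ID and label here is a placeholder.
family = Term("EX:1", "example family protein", Level.FAMILY)
gene = Term("EX:2", "example gene product", Level.GENE, parent=family)
human_gene = Term("EX:3", "example gene product (human)", Level.GENE,
                  parent=gene, organism="human")
sequence = Term("EX:4", "example isoform 1", Level.SEQUENCE, parent=gene)
modified = Term("EX:5", "example isoform 1, phosphorylated",
                Level.MODIFICATION, parent=sequence)
```

In this sketch, a modification-level term inherits membership in every more general class above it, and the organism-specific derivative is modeled (as an assumption) as a child of its organism-agnostic counterpart at the same level.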

Highlights

  • Scientific discourse is predicated on mutual understanding, verification of information and the inference of new connections

  • Placing entities in an ontological context facilitates the ability to make connections over short and long ranges within and between networks. It is toward these goals that the Open Biological and Biomedical Ontologies (OBO) Foundry was created [1]

  • The Protein Ontology (PRO) has been used to define dendritic and hematopoietic cell types [2,3], to describe biological processes, to flag protein entities mentioned in the literature [4] and to capture information extracted from the literature in text mining workflows [5,6]


Introduction

Scientific discourse is predicated on mutual understanding, verification of information and the inference of new connections. Each PRO term categorized as ‘family-level’ refers to the class of proteins translated from a specific set of ancestrally related genes. [...] refers to the class of proteins translated from a different gene related by 1:1 orthology in distinct organisms.

Nucleic Acids Research, 2014, Vol. 42, Database issue, D416
