Abstract

The Cell Cycle Ontology (CCO) is an application ontology that automatically captures and integrates detailed knowledge on the cell cycle process by combining, interlinking and enriching knowledge from various sources. CCO uses Semantic Web technologies, and it is accessible via the web for browsing, visualising, advanced querying, and computational reasoning. CCO facilitates a detailed analysis of cell cycle related molecular network components. Through querying and automated reasoning, it may provide new hypotheses to help steer a systems biology approach to biological network building. The ontology is available at "http://www.cellcycleontology.org":http://www.cellcycleontology.org. Visual exploration can be done via the BioPortal, the Ontology Lookup Service, the Ontology Online service, or the DIAMONDS platform.

*The Cell Cycle Ontology*

The Cell Cycle Ontology captures detailed information (as terms and relationships) on the cell cycle process by combining representations from several public sources [1]. CCO supports four model organisms (H. sapiens, A. thaliana, S. pombe and S. cerevisiae) with separate ontologies and one integrated ontology. It is an application ontology that is supplied as an integrated turnkey system for exploratory analysis, advanced querying, and automated reasoning.

CCO holds more than 13,000 concepts and 30 types of relationships. It comprises data from existing resources such as the Gene Ontology (GO), the Relations Ontology (RO), the IntAct database (MI), the NCBI taxonomy, and the UniProt Knowledge Base, as well as orthology data.

An automatic pipeline periodically builds CCO from scratch. First, some existing ontologies (GO, RO, MI, and in-house ones) are automatically fetched, integrated and merged, producing a core cell cycle ontology. Then, organism-specific protein and gene data are added from UniProt and from the GO Annotation files, generating four organism-specific ontologies.
Those four ontologies are then merged, and further terms are added from an ontology built automatically by running OrthoMCL on the cell cycle proteins.

*Formats and queries*

CCO is built in OBOF with ONTO-PERL and exported to other formats afterwards [2]. CCO is available in OBOF, RDF, XML, OWL, GML, and DOT. The Semantic Web formats RDF and OWL allow queries on CCO. At a SPARQL endpoint, complex queries on the RDF version can be formulated, such as “retrieve all the core cell cycle proteins in S. cerevisiae that are located in the cytoplasm and that have a hydrolysis-related function”.

Relational closures are pre-inferred in the RDF triple store by running SPARUL update queries over CCO and Metarel. This allows for very simple and responsive queries over long chains of relations in CCO.

Finally, during the maintenance phase, a semantic improvement of the OWL version is carried out: Ontology Design Patterns are included using the Ontology Pre-Processor Language. The resulting CCO is designed to provide a richer view of the cell cycle regulatory process, in particular by accommodating the intrinsic dynamics of this process.

*References*

1. Antezana E, Egaña M, Blondé W et al. The Cell Cycle Ontology: An application ontology for the representation and integrated analysis of the cell cycle process. Genome Biology, 2009, 10:5.

2. Antezana E, Egaña M, De Baets B, Kuiper M, Mironov V. ONTO-PERL: an API supporting the development and analysis of bio-ontologies. Bioinformatics, 2008, pp. 885–887.
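
*Illustrative sketch: pre-inferred relational closures*

The idea of pre-inferring relational closures, described under *Formats and queries*, can be illustrated with a minimal sketch. It is not the SPARUL/Metarel procedure used for CCO: it is a plain Python stand-in that assumes a single transitive relation (here called `part_of`) and uses hypothetical toy term names rather than real CCO identifiers. It only shows why materialising the closure in advance turns a question about a long chain of relations into a single lookup.

```python
from collections import defaultdict


def materialize_closure(triples, transitive_relations):
    """Return the input triples plus every triple implied by transitivity.

    `triples` is a set of (subject, predicate, object) tuples.
    `transitive_relations` names the predicates assumed to be transitive.
    """
    closed = set(triples)
    for rel in transitive_relations:
        # Adjacency list restricted to the current relation.
        successors = defaultdict(set)
        for s, p, o in triples:
            if p == rel:
                successors[s].add(o)
        # Depth-first walk from each subject to collect all reachable objects.
        for start in list(successors):
            stack = list(successors[start])
            seen = set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                stack.extend(successors.get(node, ()))
            closed.update((start, rel, target) for target in seen)
    return closed


# Hypothetical toy data; these are not real CCO terms or identifiers.
triples = {
    ("protein_X", "located_in", "cytoplasm"),
    ("DNA_replication", "part_of", "S_phase"),
    ("S_phase", "part_of", "interphase"),
    ("interphase", "part_of", "cell_cycle"),
}

closed = materialize_closure(triples, {"part_of"})

# After materialisation, a three-step chain is answered by one membership test.
print(("DNA_replication", "part_of", "cell_cycle") in closed)
```

The trade-off in this sketch is the usual one for materialisation: the store grows by the inferred triples, and in exchange queries no longer need to traverse chains at query time.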