Summary Integration of information is essential for making use of the wealth of bioinformatics resources. One aspect of integration is making databases interoperable through well-annotated information. New databases typically aim to store complementary information, which results in collections of heterogeneous information systems. Concepts in these databases need to be connected, and ontologies typically provide a common terminology for sharing information among different resources. Our research focuses on the zebrafish, and we have developed several information systems in which ontologies are crucial. The pivot is an ontology describing developmental anatomy, referred to as the Developmental Anatomy Ontology of Zebrafish (DAOZ). The anatomical and temporal concepts are provided by the Zebrafish Information Network (ZFIN) and are well established within the research community. We have constructed a 3D digital atlas of zebrafish development based on histology; the atlas is a series of volumetric models, and in each model every volume element is assigned to an anatomical term. Complementing the atlas, we developed an information system containing 3D patterns of gene expression in zebrafish development, based on marker genes. The spatial and temporal annotations of these 3D images are drawn from the ontology that we have designed. The DAOZ ontology is structured as a Directed Acyclic Graph (DAG). This structure is required to find unique concept paths and to prevent self-reference. Because we need to address the ontology directly, the DAG structure is transferred to a database. This database is used to integrate our databases, which share concepts at different levels of aggregation. To ensure that sufficient levels of aggregation are present for the intended applications, the original vocabulary was enriched with additional relations and concepts. 
Both databases can now be addressed with the same unique terms, and co-occurrence and co-expression of genes can be readily extracted from them. Integration can be extended further to the ZFIN resource and to ontologies that relate to genes and gene expression (e.g. Gene Ontology). In this manner, interoperable information retrieval from heterogeneous databases can be realized. This greatly facilitates processing complex information and retrieving relations in the data through machine learning approaches.
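
As an illustration of the idea of holding a DAG-structured ontology in a database and querying it at different levels of aggregation, the following is a minimal sketch and not the DAOZ implementation. It assumes a single `part_of` relation stored as an edge table in SQLite; the table name, the function names, the anatomical terms, and the gene labels (`geneA`, `geneB`) are all hypothetical and chosen only for the example.

```python
import sqlite3


def ancestors(con, term):
    """Return every term reachable from `term` by following child -> parent edges."""
    rows = con.execute(
        """
        WITH RECURSIVE up(term) AS (
            SELECT parent FROM part_of WHERE child = ?
            UNION
            SELECT p.parent FROM part_of AS p JOIN up ON p.child = up.term
        )
        SELECT term FROM up
        """,
        (term,),
    ).fetchall()
    return {row[0] for row in rows}


def descendants(con, term):
    """Return every term that is (transitively) part of `term`."""
    rows = con.execute(
        """
        WITH RECURSIVE down(term) AS (
            SELECT child FROM part_of WHERE parent = ?
            UNION
            SELECT p.child FROM part_of AS p JOIN down ON p.parent = down.term
        )
        SELECT term FROM down
        """,
        (term,),
    ).fetchall()
    return {row[0] for row in rows}


def add_relation(con, child, parent):
    """Insert a child -> parent edge, refusing any edge that would create a cycle."""
    # If `child` is already an ancestor of `parent`, the new edge would make
    # a term its own ancestor, so the graph would no longer be acyclic.
    if child == parent or child in ancestors(con, parent):
        raise ValueError(f"{child} -> {parent} would create a cycle")
    con.execute("INSERT INTO part_of (child, parent) VALUES (?, ?)", (child, parent))


def build_ontology(edges):
    """Create an in-memory database holding the ontology as an edge table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE part_of ("
        "child TEXT NOT NULL, parent TEXT NOT NULL, PRIMARY KEY (child, parent))"
    )
    for child, parent in edges:
        add_relation(con, child, parent)
    return con


def genes_at(con, annotations, term):
    """Collect genes annotated to `term` or to any of its sub-structures."""
    genes = set()
    for t in {term} | descendants(con, term):
        genes |= annotations.get(t, set())
    return genes


# Hypothetical terms and annotations, for illustration only.
con = build_ontology([
    ("atrium", "heart"),
    ("ventricle", "heart"),
    ("heart", "cardiovascular system"),
])
annotations = {"atrium": {"geneA"}, "ventricle": {"geneA", "geneB"}}
print(sorted(genes_at(con, annotations, "heart")))
```

Because annotations are attached to ontology terms rather than to database-specific labels, the same `genes_at` lookup could in principle be applied to an atlas table and an expression table alike; querying at `heart` aggregates everything annotated to its parts, while querying at `atrium` returns only the finer-grained annotations.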