Semantic web data warehousing for caGrid.

Jamie P Mccusker, Alejandra González Beltrán, Michael Krauthammer, Joshua A Phillips, Anthony Finkelstein

doi:10.1186/1471-2105-10-s10-s2


Open Access


Abstract

The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically annotated caBIG® Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. Our approach is generalizable and of broad utility to researchers facing similar integration challenges.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-10-S10-S2) contains supplementary material, which is available to authorized users.

Highlights

We propose a Semantic Web data warehouse approach that enables users to map data from multiple grid data sources into an ontologically driven data store, or knowledge base (KB), where they can use data from a semantic perspective. caGrid, a core technology of caBIG® ("Cancer Biomedical Informatics Grid") [1,2,3,4,5], is a semantically annotated grid sponsored by the National Cancer Institute that provides a consistent framework for grid web services.
The information models of the grid services are mapped to concepts from the National Cancer Institute (NCI) Thesaurus (NCIt) [6,7,8,9], a rich, cancer-focused terminology source, through Common Data Elements (CDEs) registered in the Cancer Data Standards Repository [10].
We generated the OWL ontologies for caTissue, caArray, and the NCIt concepts they use from the metadata available from those services.
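
As a rough illustration of generating OWL from annotated model metadata, the sketch below renders a UML class description as OWL statements in Turtle syntax. It is only a minimal sketch under stated assumptions: the namespace URIs, the function names, the example class name, and the choice to link a class to its NCIt concept with rdfs:subClassOf are all hypothetical and are not taken from the paper, which may encode the annotation differently.

```python
# Hypothetical namespaces, chosen only for illustration.
MODEL_NS = "http://example.org/model#"
NCIT_NS = "http://example.org/ncit#"

PREFIXES = "\n".join([
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
])


def uml_class_to_turtle(name, concept_code, superclass=None):
    """Render one annotated UML class as OWL statements in Turtle syntax.

    Assumption: the concept annotation is expressed as rdfs:subClassOf.
    """
    lines = [f"<{MODEL_NS}{name}> a owl:Class ;"]
    if superclass is not None:
        # Preserve the UML generalization as an OWL subclass axiom.
        lines.append(f"    rdfs:subClassOf <{MODEL_NS}{superclass}> ;")
    lines.append(f"    rdfs:subClassOf <{NCIT_NS}{concept_code}> .")
    return "\n".join(lines)


def model_to_turtle(classes):
    """Render a list of class descriptions (dicts) as one Turtle document."""
    body = [uml_class_to_turtle(**c) for c in classes]
    return PREFIXES + "\n\n" + "\n\n".join(body) + "\n"


if __name__ == "__main__":
    # "Specimen" and "CODE_A" are placeholder values, not real metadata.
    print(model_to_turtle([{"name": "Specimen", "concept_code": "CODE_A"}]))
```

A real pipeline would read the class names, hierarchies, and concept codes from the service metadata rather than from hand-written dictionaries.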

Summary

Introduction

caGrid, a core technology of caBIG® ("Cancer Biomedical Informatics Grid") [1,2,3,4,5], is a semantically annotated grid sponsored by the National Cancer Institute that provides a consistent framework for grid web services. The information models of the grid services are mapped to concepts from the NCI Thesaurus (NCIt) [6,7,8,9], a rich, cancer-focused terminology source, through Common Data Elements (CDEs) registered in the Cancer Data Standards Repository (caDSR) [10]. CDEs represent semantically interoperable "join points" among information models, which provide a basis for data integration. There is no transparent mapping of semantics onto data from grid services. Semantic interoperability is the job of the client and requires the ability to reason over (or interpret) the metadata, including class hierarchies, attributes, associations, and their corresponding annotations to establish equivalencies.
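
The idea of shared annotations acting as "join points" can be sketched in a few lines. The example below groups classes from different models by the concept they are annotated with and reports concepts that appear in more than one model. It is a simplified illustration, not the paper's method: the function name, the class names, and the concept identifiers are hypothetical placeholders, and real alignment would also have to reason over hierarchies, attributes, and associations.

```python
from collections import defaultdict


def find_join_points(annotations):
    """Group (model, class) pairs by shared concept annotation.

    `annotations` is an iterable of (model, class_name, concept) triples.
    Only concepts used by at least two different models are returned.
    """
    by_concept = defaultdict(set)
    for model, cls, concept in annotations:
        by_concept[concept].add((model, cls))
    return {
        concept: sorted(members)
        for concept, members in by_concept.items()
        if len({model for model, _ in members}) > 1
    }


# Placeholder annotations; the class names and concept IDs are invented.
ANNOTATIONS = [
    ("caTissue", "Specimen", "CONCEPT_SPECIMEN"),
    ("caArray", "BioSample", "CONCEPT_SPECIMEN"),
    ("caTissue", "Participant", "CONCEPT_PARTICIPANT"),
]

if __name__ == "__main__":
    for concept, members in find_join_points(ANNOTATIONS).items():
        print(concept, members)
```

Here only the concept shared by both models is reported as a candidate join point; the concept used by a single model is filtered out.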

Methods

Results

Conclusion


Journal: BMC Bioinformatics	Publication Date: Oct 1, 2009
Citations: 63	License type: cc-by

Similar Papers

Adapted Rules for UML Modelling of Geospatial Information for Model-Driven Implementation as OWL Ontologies
Knut Jetlund ... Erling Onstein
ISPRS International Journal of Geo-Information | VOL. 8
22 Aug 2019

A STRUCTURE OF UML PROFILES FOR MODELLING OF GEOSPATIAL INFORMATION IN GIS, ITS AND BIM
K Jetlund
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences | VOL. VI-4/W1-2020
03 Sep 2020

Mapping UML Sequence Diagram into the Web Ontology Language OWL
Mo’Men Elsayed ... Nermeen Elkashef
International Journal of Advanced Computer Science and Applications | VOL. 11
01 Jan 2020

The caCORE Software Development Kit: Streamlining construction of interoperable biomedical information services
Joshua Phillips ... Gilberto Fragoso
BMC Medical Informatics and Decision Making | VOL. 6
06 Jan 2006