Abstract

Translational research in Life-Science now leverages e-Science platforms to analyse and produce huge amounts of data. With the unprecedented growth of Life-Science data repositories, identifying relevant data for analysis becomes increasingly difficult. Instrumenting e-Science platforms with provenance tracking techniques provides useful information for designing or debugging data analysis processes. However, raw provenance traces are too massive and too generic to facilitate the scientific interpretation of data. In this paper, we propose an integrated approach in which Life-Science knowledge is (i) captured through domain ontologies and linked to Life-Science data analysis tools, and (ii) propagated through rules to the produced data, in order to constitute human-tractable experiment summaries. Our approach has been implemented in the Virtual Imaging Platform, and experimental results show that it is feasible to produce a small number of domain-specific statements, which opens new data sharing and repurposing opportunities in line with Linked Data initiatives.

Highlights

  • Introduction: Digital Life-Science data, ranging from the molecular scale (e.g. protein structural information) to the human-body scale (e.g. radiological images) and including records as diverse as biological samples, epidemiological data, and clinical information, is acquired using many kinds of sensors

  • Experimental setup: The Virtual Imaging Platform (VIP), a simulation platform, hosts a semantic catalog of organ models, which associates each model's raw source files with the semantic annotations describing that model

Summary

Introduction

Digital Life-Science data, ranging from the molecular scale (e.g. protein structural information) to the human-body scale (e.g. radiological images) and including records as diverse as biological samples, epidemiological data, and clinical information, is acquired using many kinds of sensors. To enable the reuse (and possibly the repurposing) of data in future studies, it is critical for e-Science platforms to keep track of the links between source data, produced data, and the annotations associated with either the source data or the transformation process itself. This data provenance information facilitates data reinterpretation, data quality assessment, data processing validation, debugging, experiment reproducibility, ownership control over scientific outcomes, etc. The first objective of this work is to instrument data processing tools with domain-specific information describing both the kind of data processed and the data transformation process implemented (see Section 4). Based on this captured knowledge, the second objective of this work is to analyse the dense provenance traces generated, combined with the tool and source data annotations, to produce experiment summaries that are both human-tractable and informative for scientists (see Section 5).
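
The two objectives can be illustrated with a minimal sketch. It is not the authors' implementation: the paper relies on domain ontologies and rules, whereas this sketch uses plain Python sets, and every tool name, file name, and annotation term in it is hypothetical. It assumes a single propagation rule (each output inherits the annotations of its inputs plus those of the tool that produced it) and a summary that keeps statements only for final products.

```python
from dataclasses import dataclass

# All names below (tools, files, annotation terms) are hypothetical
# illustrations, not terms taken from the paper's ontologies or from VIP.

@dataclass(frozen=True)
class Invocation:
    tool: str
    inputs: tuple
    outputs: tuple

# Domain annotations attached to source data (e.g. from a model catalog).
data_annotations = {
    "brain_model.zip": {("represents", "brain")},
}

# Domain annotations attached to tools.
tool_annotations = {
    "mri_simulator": {("has_modality", "MRI")},
    "format_converter": set(),  # generic step with no domain meaning
}

# A raw provenance trace: one record per tool invocation, in execution order.
trace = [
    Invocation("mri_simulator", ("brain_model.zip",), ("sim_raw.dat",)),
    Invocation("format_converter", ("sim_raw.dat",), ("sim_image.nii",)),
]

def propagate(trace, data_annotations, tool_annotations):
    """Assumed rule: an output inherits its inputs' annotations plus the tool's."""
    annotations = {name: set(facts) for name, facts in data_annotations.items()}
    for inv in trace:
        inherited = set()
        for name in inv.inputs:
            inherited |= annotations.get(name, set())
        inherited |= tool_annotations.get(inv.tool, set())
        for name in inv.outputs:
            annotations.setdefault(name, set()).update(inherited)
    return annotations

def summarize(trace, annotations):
    """Keep statements only for final products (outputs never reused as inputs)."""
    consumed = {name for inv in trace for name in inv.inputs}
    produced = [name for inv in trace for name in inv.outputs]
    return sorted(
        (name, prop, value)
        for name in produced if name not in consumed
        for prop, value in annotations[name]
    )

summary = summarize(trace, propagate(trace, data_annotations, tool_annotations))
for statement in summary:
    print(statement)
```

In this toy trace, the intermediate file and the generic conversion step disappear from the summary, and only two domain-level statements about the final image remain. This is the kind of reduction from a dense trace to a few informative statements that the text describes, under the simplifying assumptions stated above.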

