Abstract
Data provenance captured from scientific applications is a critical precursor to data sharing and reuse. For researchers who want to repurpose data, it is a source of information about the lineage and attribution of the data, which is needed to establish trust in a data set. Komadu is a standalone provenance capture and visualization system for capturing, representing, and manipulating provenance coming from scientific tools, infrastructures, and repositories. It uses the W3C PROV standard [1] to represent data, and it is the successor to the Karma [2] provenance capture system, which was based on the Open Provenance Model (OPM) [3]. Komadu comes with two interfaces: a Web Services interface based on Apache Axis2 [4] and a messaging interface based on RabbitMQ [5]. Komadu is completely open source, and the source code is publicly available on GitHub [6]. Although Komadu has been used most extensively in scientific research, its interfaces are designed to collect and visualize provenance from any kind of application that needs it.
Highlights
We introduce the Komadu [9] provenance capture and visualization system
Collecting provenance without a global context identifier introduces the challenge, handled within Komadu, of stitching together graphs from events that are not identifiable as being causally related
Komadu is backward compatible with Karma through graph generation that uses global context identifiers
Summary
Unlike Karma, the graph generation algorithm used in Komadu does not depend on any global context identifier. This makes it possible to collect provenance from disparate and unrelated pieces of infrastructure and applications. It also introduces the challenge, handled within Komadu, of stitching together graphs from events that are not identifiable as being causally related. Komadu remains backward compatible with Karma through graph generation that uses global context identifiers. New users are advised to use the more convenient context-less mechanism.
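To make the idea of context-less stitching concrete, the following is a minimal illustrative sketch and not Komadu's actual algorithm. It assumes, hypothetically, that each captured event is a dictionary with a `name` and a list of `ids` for the entities, activities, or agents it mentions, and it places two events in the same graph whenever they share at least one identifier. The function name `stitch` and the event layout are assumptions made for illustration only.

```python
from collections import defaultdict


def stitch(events):
    """Group events into graphs without any global context identifier.

    Hypothetical rule: two events belong to the same graph when they are
    linked, directly or transitively, by a shared identifier.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Every identifier mentioned by one event is merged into one component.
    for event in events:
        first = event["ids"][0]
        find(first)
        for other in event["ids"][1:]:
            union(first, other)

    # Collect event names per connected component.
    graphs = defaultdict(list)
    for event in events:
        graphs[find(event["ids"][0])].append(event["name"])
    return sorted(sorted(names) for names in graphs.values())


events = [
    {"name": "e1", "ids": ["a", "b"]},
    {"name": "e2", "ids": ["b", "c"]},
    {"name": "e3", "ids": ["x"]},
]
result = stitch(events)
```

Here `e1` and `e2` end up in one graph because both mention `b`, while `e3` forms its own graph. A context-identifier approach in the style of Karma would instead group events by an explicit shared key supplied with each event.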