Provenance, the metadata that records the derivation history of scientific results, is essential in scientific workflows to support the reproducibility of scientific discovery, result interpretation, and problem diagnosis. To promote and facilitate interoperability among heterogeneous provenance systems, the Open Provenance Model (OPM) was first proposed in 2008 and has since played an important role in the community. In this paper, we present OPMProv, a relational-database-based scientific workflow provenance system that is compliant with OPM (v1.1). Our main contributions are: (i) we design an entity–relationship diagram for OPM and translate it into a relational database schema for the storage of provenance; and (ii) we show that the provenance reasoning defined in OPM (v1.1) can be adequately supported by OPMProv using recursive views and SQL queries alone, without any additional reasoning engine. Experiments are conducted to evaluate the performance of OPMProv in data insertion and provenance querying. A case study demonstrates that OPMProv can answer 14 of the 16 queries defined in the Third Provenance Challenge.
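
To illustrate how a recursive view alone can carry out provenance reasoning, the following minimal sketch computes multi-step derivation as the transitive closure of single-step wasDerivedFrom edges. It is not OPMProv's actual schema: the table `was_derived_from`, the view `multi_step_was_derived_from`, the column names, and the helper functions are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical one-table OPM fragment."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical table: one row per asserted single-step
        -- wasDerivedFrom edge (effect artifact derived from cause artifact).
        CREATE TABLE was_derived_from (
            effect_artifact TEXT NOT NULL,
            cause_artifact  TEXT NOT NULL,
            PRIMARY KEY (effect_artifact, cause_artifact)
        );

        -- Recursive view: transitive closure of the single-step edges.
        -- UNION (rather than UNION ALL) removes duplicates, so the
        -- recursion terminates even if the edge table contains a cycle.
        CREATE VIEW multi_step_was_derived_from AS
        WITH RECURSIVE closure(effect_artifact, cause_artifact) AS (
            SELECT effect_artifact, cause_artifact
            FROM was_derived_from
            UNION
            SELECT c.effect_artifact, d.cause_artifact
            FROM closure AS c
            JOIN was_derived_from AS d
              ON d.effect_artifact = c.cause_artifact
        )
        SELECT effect_artifact, cause_artifact FROM closure;
        """
    )
    # Illustrative data only: a4 <- a3 <- a2 <- a1.
    conn.executemany(
        "INSERT INTO was_derived_from VALUES (?, ?)",
        [("a2", "a1"), ("a3", "a2"), ("a4", "a3")],
    )
    return conn


def ancestors(conn, artifact):
    """Return every artifact that `artifact` was derived from, directly or not."""
    rows = conn.execute(
        "SELECT cause_artifact FROM multi_step_was_derived_from "
        "WHERE effect_artifact = ? ORDER BY cause_artifact",
        (artifact,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(ancestors(conn, "a4"))  # ['a1', 'a2', 'a3']
```

Because the closure is defined as a view, a lineage question becomes an ordinary `SELECT` against it, which is the sense in which no separate reasoning engine is needed.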