Using Domain-Specific Data to Enhance Scientific Workflow Steering Queries

João Carlos De A.R Gonçalves, Daniel De Oliveira, Marta Mattoso, Kary A C S Ocaña, Eduardo Ogasawara

doi:10.1007/978-3-642-34222-6_12

Open Access

https://doi.org/10.1007/978-3-642-34222-6_12

Abstract

In scientific workflows, provenance data helps scientists understand, evaluate, and reproduce their results. Provenance data generated at runtime can also support workflow steering mechanisms. Providing steering facilities for workflows is considered a challenge because of the dynamic demands that arise during execution. To steer, for example, scientists should be able to suspend (or stop) a workflow execution when the approximate solution meets (or deviates from) preset criteria. These criteria are commonly evaluated based on provenance data (execution data) and domain-specific data. We claim that the final decision on whether to interfere with the workflow execution may only become feasible when scientists can steer workflows using provenance data enriched with domain-specific data. In this paper we propose an approach based on specialized software components, named Data Extractor (DE), to acquire domain-specific data from data files produced during a scientific workflow execution. A DE gathers domain-specific data from the produced data files and associates it with existing provenance data in the provenance repository. We have evaluated the proposed approach using a real bioinformatics workflow for comparative genomics executed on the SciCumulus parallel cloud workflow engine.
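
The idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from SciCumulus: the two-table schema (`activation`, `domain_data`), the `key = value` output file format, the `e_value` field, and all function names are hypothetical. The sketch shows a toy extractor that parses domain values from a produced file, links them to an existing provenance record, and a steering query that combines execution status with a domain-specific threshold.

```python
import sqlite3


def create_repository(conn):
    # Hypothetical provenance schema: one row per activity execution
    # (activation) plus a table of domain values linked to it.
    conn.executescript("""
        CREATE TABLE activation (
            id INTEGER PRIMARY KEY,
            activity TEXT NOT NULL,
            status TEXT NOT NULL
        );
        CREATE TABLE domain_data (
            activation_id INTEGER NOT NULL REFERENCES activation(id),
            name TEXT NOT NULL,
            value REAL NOT NULL
        );
    """)


def extract_domain_data(text):
    # Toy extractor: assumes the produced file holds "key = value" lines
    # and keeps only the entries whose value parses as a number.
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("=")
        if not sep:
            continue
        try:
            values[key.strip()] = float(raw)
        except ValueError:
            continue
    return values


def register_output(conn, activation_id, text):
    # Associate the extracted domain values with an existing provenance record.
    rows = [(activation_id, k, v) for k, v in extract_domain_data(text).items()]
    conn.executemany("INSERT INTO domain_data VALUES (?, ?, ?)", rows)
    return len(rows)


def activations_to_stop(conn, name, threshold):
    # Steering query: running activations whose domain value exceeds a threshold.
    cur = conn.execute(
        """SELECT a.id
           FROM activation a
           JOIN domain_data d ON d.activation_id = a.id
           WHERE a.status = 'RUNNING' AND d.name = ? AND d.value > ?
           ORDER BY a.id""",
        (name, threshold),
    )
    return [row[0] for row in cur]


def demo():
    conn = sqlite3.connect(":memory:")
    create_repository(conn)
    conn.executemany(
        "INSERT INTO activation VALUES (?, ?, ?)",
        [(1, "alignment", "RUNNING"),
         (2, "alignment", "RUNNING"),
         (3, "alignment", "FINISHED")],
    )
    register_output(conn, 1, "e_value = 1e-30\nscore = 250\n")
    register_output(conn, 2, "e_value = 0.5\nscore = 12\n")
    register_output(conn, 3, "e_value = 0.9\n")
    return activations_to_stop(conn, "e_value", 1e-5)


if __name__ == "__main__":
    print(demo())
```

In this sketch the query cannot be answered from execution data alone: the `status` column comes from provenance, while the `e_value` threshold only becomes queryable once the extractor has loaded it into the repository.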

Publication Date: Jan 1, 2012
Citations: 37
License type: other-oa

Similar Papers

A Practical Roadmap for Provenance Capture and Data Analysis in Spark-Based Scientific Workflows
Thaylon Guedes ... Marta Mattoso
01 Nov 2018

Data Analytics in Bioinformatics: Data Science in Practice for Genomics Analysis Workflows
Kary A C S Ocaña ... Daniel De Oliveira
01 Aug 2015

A data dependency based strategy for intermediate data storage in scientific cloud workflow systems
Dong Yuan ... Jinjun Chen
Concurrency and Computation: Practice and Experience | VOL. 24
27 Aug 2010

An autonomous blockchain-based workflow execution broker for e-science
Alper Alimoğlu ... Can Özturan
Cluster Computing | VOL. 27
15 May 2024
