Abstract

Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.

Highlights

  • In data-intensive science, e-infrastructures and software tool-chains are heavily used to help scientists manage, analyze, and share increasing volumes of complex data [1]

Summary

INTRODUCTION

E-infrastructures and software tool-chains are heavily used to help scientists manage, analyze, and share increasing volumes of complex data [1]. At the other end are Workflow Management Systems (WfMS) that provide http://commonwl.org/. WfMS may execute over HPC or geographically distributed clusters, cloud environments across systems, or even from desktops. They vary in their mechanisms to prepare their components to become executable steps, and they must manage portability and dependencies on the infrastructure used to run them. Capturing the control flow order between components explicitly exposes the dataflow and data dependencies between the inputs and outputs of the processing steps. This explicit separation is fundamental to supporting workflow comprehension, design modularity, workflow comparisons, and alternative execution strategies. References to FAIR principles [23] are given in brackets.
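
As an illustration of how declaring step inputs and outputs exposes data dependencies, the following is a minimal Python sketch and not the mechanism of any particular WfMS. The step names (collect, prepare, analyze), the data item names, and the helper functions are all hypothetical. Each step lists the data it consumes and produces; the dependencies between steps, and a valid execution order, are then derived from those declarations alone.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step declares the data items it consumes and produces.
STEPS = {
    "collect": {"inputs": [], "outputs": ["raw"]},
    "prepare": {"inputs": ["raw"], "outputs": ["clean"]},
    "analyze": {"inputs": ["clean"], "outputs": ["report"]},
}


def step_dependencies(steps):
    """Map each step to the set of steps that produce its inputs."""
    # Index every data item by the step that produces it.
    producer = {
        item: name for name, spec in steps.items() for item in spec["outputs"]
    }
    # Inputs with no producer are treated as external and add no dependency.
    return {
        name: {producer[item] for item in spec["inputs"] if item in producer}
        for name, spec in steps.items()
    }


def execution_order(steps):
    """Return one execution order that respects the data dependencies."""
    return list(TopologicalSorter(step_dependencies(steps)).static_order())


if __name__ == "__main__":
    print(step_dependencies(STEPS))
    print(execution_order(STEPS))
```

Because the order is computed from the declared dataflow rather than written by hand, the same description could in principle be compared with other workflows or scheduled differently, for example by running independent steps in parallel.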

FAIR DATA FOR AND FROM WORKFLOWS
FAIR CRITERIA FOR WORKFLOWS AS DIGITAL OBJECTS
CONCLUSIONS