Abstract

Scientific workflow systems provide languages for representing complex scientific processes as decompositions into lower-level tasks, down to the level of atomic, executable units. To support data analysis activities, a wide variety of such languages represent data transformation and processing operations as task nodes within a workflow. Adding data type information to task inputs and outputs allows workflow authors to perform type checking at design time, search for compatible nodes in public component repositories, and define specifications of abstract workflows. Strict data typing makes these capabilities simpler to implement in a workflow system, but at the expense of flexibility. We address this challenge by introducing workflow type signatures suitable for use in registries and for type matching, and by developing polymorphic type inference over compositions of such signatures. The focus is on the relational data model, which is popular in data analysis workflow systems. The techniques are validated by applying a prototype inference engine to an adverse drug reaction study implemented in the relational algebra subset of the Discovery Net workflow system.
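
As a rough illustration of what a polymorphic signature over relational data might look like, the Python sketch below models a node signature as the columns it requires, adds, and drops, while passing any other input columns through unchanged. This is not the paper's formalism or the Discovery Net API: the `Signature` class, the `apply_signature` and `compose` helpers, and all column names are hypothetical, chosen only to show how a signature for a composition can be inferred from the signatures of its parts.

```python
from dataclasses import dataclass


@dataclass
class Signature:
    """Hypothetical node signature over relational schemas (name -> type)."""
    requires: dict    # columns that must be present on the input
    adds: dict        # columns appended to (or overwritten on) the output
    drops: frozenset  # columns removed from the output


def apply_signature(sig, schema):
    """Type-check a concrete input schema and return the output schema."""
    for col, typ in sig.requires.items():
        if schema.get(col) != typ:
            raise TypeError(f"column {col!r} must have type {typ!r}")
    # Columns the signature does not mention pass through unchanged;
    # this is what makes the signature polymorphic in the input schema.
    out = {c: t for c, t in schema.items() if c not in sig.drops}
    out.update(sig.adds)
    return out


def compose(first, second):
    """Infer the signature of running `first` and then `second`."""
    requires = dict(first.requires)
    for col, typ in second.requires.items():
        if col in first.adds:
            # Requirement is satisfied internally by the first node.
            if first.adds[col] != typ:
                raise TypeError(f"type mismatch on column {col!r}")
        elif col in first.drops:
            raise TypeError(f"column {col!r} is dropped before it is used")
        else:
            # Requirement is pushed back onto the input of the composition.
            if requires.get(col, typ) != typ:
                raise TypeError(f"conflicting types for column {col!r}")
            requires[col] = typ
    adds = {c: t for c, t in first.adds.items() if c not in second.drops}
    adds.update(second.adds)
    drops = frozenset((first.drops | second.drops) - set(adds))
    return Signature(requires, adds, drops)


# Hypothetical nodes: derive a column, then discard the raw dose.
derive = Signature(
    requires={"dose_mg": "float", "weight_kg": "float"},
    adds={"dose_per_kg": "float"},
    drops=frozenset(),
)
project = Signature(
    requires={"patient_id": "int", "dose_per_kg": "float"},
    adds={},
    drops=frozenset({"dose_mg"}),
)
pipeline = compose(derive, project)

table = {"patient_id": "int", "dose_mg": "float",
         "weight_kg": "float", "age": "int"}
result = apply_signature(pipeline, table)
```

In this sketch the inferred `pipeline` signature can be checked against a table schema without running either node, and the `age` column, which neither node mentions, survives into `result`. A registry search for compatible nodes could, under the same assumptions, compare a candidate's `requires` against the columns available at a given point in a workflow.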
